Hubert Tong [Mon, 5 Aug 2019 13:55:41 +0000 (13:55 +0000)]
[yaml2obj][tests] Fix overly restrictive od output check
Summary:
rL364517 introduced further instances of `od` output checking of the
kind previously corrected by rL363829. This patch corrects the issue by
suppressing output of the input offset. The check remains sufficiently
sensitive to test for the intended value of the specific byte since the
relevant byte value is the only output we are expecting from `od`.
Cullen Rhodes [Mon, 5 Aug 2019 13:44:10 +0000 (13:44 +0000)]
[AArch64] Implement initial SVE calling convention support
Summary:
This patch adds initial support for the SVE calling convention such that
SVE types can be passed as arguments and return values to/from a
subroutine.
The SVE AAPCS states [1]:
z0-z7 are used to pass scalable vector arguments to a subroutine,
and to return scalable vector results from a function. If a
subroutine takes arguments in scalable vector or predicate
registers, or if it is a function that returns results in such
registers, it must ensure that the entire contents of z8-z23 are
preserved across the call. In other cases it need only preserve the
low 64 bits of z8-z15, as described in ยง5.1.2.
p0-p3 are used to pass scalable predicate arguments to a subroutine
and to return scalable predicate results from a function. If a
subroutine takes arguments in scalable vector or predicate
registers, or if it is a function that returns results in these
registers, it must ensure that p4-p15 are preserved across the call.
In other cases it need not preserve any scalable predicate register
contents.
SVE predicate and data registers are passed indirectly (i.e. spilled to the
stack and pass the address) if they exceed the registers used for argument
passing defined by the PCS referenced above. Until SVE stack support is merged
we can't spill SVE registers to the stack, so currently an llvm_unreachable is
used where we will eventually handle this.
Hans Wennborg [Mon, 5 Aug 2019 13:04:12 +0000 (13:04 +0000)]
test-release.sh: Perform the sed substitution on both files (PR42739)
The comparison would otherwise fail if Phase2 occurrs naturally in the
object file. It would get replaced with Phase3 in the one .o, but not
in the other.
We were already running both files through sed to have them processed in
this same way; this is a logical extension of that.
Sanjay Patel [Mon, 5 Aug 2019 11:27:07 +0000 (11:27 +0000)]
[DAGCombiner][x86] prevent infinite loop from truncate/extend transforms
The test case is based on the example from the post-commit thread for:
https://reviews.llvm.org/rGc9171bd0a955
This replaces the x86-specific simple-type check from:
rL367766
with a check in the DAGCombiner. Adding the check isn't
strictly necessary after the fix from:
rL367768
...but it seems likely that we're heading for trouble if
we are creating weird types in this transform.
I combined the earlier legality check into the initial
clause to simplify the code.
So we should only try the trunc/sext transform at the
earliest combine stage, but we limit the transform to
simple types anyway because the TLI hook is probably
too lax about what it considers a free truncate.
This method is dead. It was introduced in D47989,
but now the logic from D63475 is used in llvm-readobj instead.
Also it has a problem: it returns the first matching section,
even if there are multiple sections with the same name.
Summary:
This is patch is part of a serie to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
[LLVM][Alignment] Introduce Alignment In CallingConv
Summary:
This is patch is part of a serie to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Oliver Stannard [Mon, 5 Aug 2019 09:04:10 +0000 (09:04 +0000)]
Reland: Fix and test inter-procedural register allocation for ARM
Add an explicit construction of the ArrayRef, gcc 5 and earlier don't
seem to select the ArrayRef constructor which takes a C array when the
construction is implicit.
Original commit message:
- Avoid a crash when IPRA calls ARMFrameLowering::determineCalleeSaves
with a null RegScavenger. Simply not updating the register scavenger
is fine because IPRA only cares about the SavedRegs vector, the acutal
code of the function has already been generated at this point.
- Add a new hook to TargetRegisterInfo to get the set of registers which
can be clobbered inside a call, even if the compiler can see both
sides, by linker-generated code.
[LLVM][Alignment] Introduce Alignment Type in DataLayout
Summary:
This is patch is part of a serie to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Michael Pozulp [Mon, 5 Aug 2019 08:52:28 +0000 (08:52 +0000)]
Revert "[llvm-objdump] Re-commit r367284."
This reverts r367776 (git commit d34099926e909390cb0254bebb4b7f5cf15467c7).
My changes to llvm-objdump tests caused them to fail on windows:
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/27368
Craig Topper [Mon, 5 Aug 2019 03:48:31 +0000 (03:48 +0000)]
[X86] Fix a bad early out in combineExtInVec that prevented recursive shuffle combining from running with -x86-experimental-vector-widening-legalization.
Summary:
This contains various fixes:
- Explicitly determine and return the next noreturn instruction.
- If an invoke calls a noreturn function which is not nounwind we
keep the unwind destination live. This also means we require an
invoke. Though we can still add the unreachable to the normal
destination block.
- Check if the return instructions are dead after we look for calls
to avoid triggering an optimistic fixpoint in the presence of
assumed liveness information.
- Make the interface work with "const" pointers.
- Some simplifications
While additional tests are included, full coverage is achieved only with
D59978.
[Attributor][NFC] Simplify common pattern wrt. fixpoints
When a fixpoint is indicated the change status is known due to the
fixpoint kind. This simplifies a common code pattern by making the
connection explicit.
[Attributor][NFC] Invalid DerefState is at fixpoint
Summary:
If the DerefBytesState (and thereby the DerefState) is invalid, we
reached a fixpoint for the whole DerefState as we will not
manifest/provide information then.
Craig Topper [Sun, 4 Aug 2019 17:30:41 +0000 (17:30 +0000)]
[TargetLowering][X86] Teach SimplifyDemandedVectorElts to replace the base vector of INSERT_SUBVECTOR with undef if none of the elements are demanded even if the node has other users.
Summary:
The SimplifyDemandedVectorElts function can replace with undef
when no elements are demanded, but due to how it interacts with
TargetLoweringOpts, it can only do this when the node has
no other users.
Remove a now unneeded DAG combine from the X86 backend.
David Green [Sun, 4 Aug 2019 10:18:15 +0000 (10:18 +0000)]
[ARM] MVE big endian bitcasts
This adds big endian MVE patterns for bitcasts. They are defined in llvm as
being the same as a store of the existing type and the load into the new. This
means that they have to become a VREV between the two types, working in the
same way that NEON works in big-endian. This also adds some example tests for
bigendian, showing where code is and isn't different.
The main difference, especially from a testing perspective is that vectors are
passed as v2f64, and so are VREV into and out of call arguments, and the
parameters are passed in a v2f64 format. Same happens for inline assembly where
the register class is used, so it is VREV to a v16i8.
So some of this is probably not correct yet, but it is (mostly) self-consistent
and seems to be consistent with how llvm treats vectors. The rest we can
hopefully fix later. More details about big endian neon can be found in
https://llvm.org/docs/BigEndianNEON.html.
Yonghong Song [Sat, 3 Aug 2019 23:41:26 +0000 (23:41 +0000)]
[Transforms] Do not drop !preserve.access.index metadata
Currently, when a GVN or CSE optimization happens,
the llvm.preserve.access.index metadata is dropped.
This caused a problem for BPF AbstructMemberOffset phase
as it relies on the metadata (debuginfo types).
This patch added proper hooks in lib/Transforms to
preserve !preserve.access.index metadata. A test
case is added to ensure metadata is preserved under CSE.
Sanjay Patel [Sat, 3 Aug 2019 21:46:27 +0000 (21:46 +0000)]
[x86] change free truncate hook to handle only simple types (PR42880)
This avoids the crash from:
https://bugs.llvm.org/show_bug.cgi?id=42880
...and I think it's a proper constraint for the TLI hook.
But that example raises questions about what happens to get us
into this situation (created i29 types) and what happens later
(why does legalization die on those types), so I'm not sure if
we will resolve the bug based on this change.
Keno Fischer [Sat, 3 Aug 2019 21:38:19 +0000 (21:38 +0000)]
[WebAssembly] Fix allocsize attribute in sjlj lowering
Summary:
The allocsize attribute refers to call parameters by index.
Thus, when we add the extra parameter in sjlj lowering, we
need to increment the referenced paramater in the allocsize
attribute to avoid angering the Verifier.
Lang Hames [Sat, 3 Aug 2019 20:17:10 +0000 (20:17 +0000)]
[JITLink] Add support for MachO/x86-64 UNSIGNED relocs with length=2.
MachO/x86-64 UNSIGNED relocs are almost always 64-bit (length=3), but UNSIGNED
relocs of length=2 are allowed if the target resides in the low 32-bits. This
patch adds support for such relocations in JITLink (previously they would have
triggered an unsupported relocation error).
Hubert Tong [Sat, 3 Aug 2019 18:52:45 +0000 (18:52 +0000)]
[yaml2obj][tests] Replace 8-byte `od` conversion with 1-byte conversion
Summary:
`od` on AIX does not seem to implement 8-byte integer conversions. Work
around this by using 1-byte conversions, which can be used in this case
since the value is byte-order insensitive.
Nikita Popov [Sat, 3 Aug 2019 06:47:23 +0000 (06:47 +0000)]
[Thumb] Fix invalid symbol redefinition due to duplicated jumptable (PR42760)
Fix for https://bugs.llvm.org/show_bug.cgi?id=42760. A tBR_JTr
instruction is duplicated by tail duplication, which results in
the same jumptable with the same label being emitted twice.
Fix this by marking tBR_JTr as not duplicable. The corresponding
ARM/Thumb instructions are already marked as not duplicable.
Additionally also mark tTBB_JT and tTBH_JT to be consistent with
Thumb2, even though this shouldn't be strictly necessary.
Joel E. Denny [Sat, 3 Aug 2019 06:08:19 +0000 (06:08 +0000)]
[lit] Print internal env commands
Without this patch, the internal `env` command removes `env` and its
args from the command line while parsing it. This patch modifies a
copy instead so that the original command line is printed.
Bill Wendling [Sat, 3 Aug 2019 05:52:47 +0000 (05:52 +0000)]
Emit diagnostic if an inline asm constraint requires an immediate
Summary:
An inline asm call can result in an immediate after inlining. Therefore emit a
diagnostic here if constraint requires an immediate but one isn't supplied.
Craig Topper [Sat, 3 Aug 2019 02:54:54 +0000 (02:54 +0000)]
[InstSimplify] Add test case to show bad sign bit handling for integer abs idiom in computeKnownBits.
computeKnownBits will indicate the sign bit of abs is 0 if the
the RHS operand returned by matchSelectPattern has the nsw flag set.
For abs idioms like (X >= 0) ? X : -X, the RHS returns -X. But
we can also match ((X-Y) >= 0 ? X-Y : Y-X as abs. In this case
RHS will be the Y-X operand. According to Alive, the sign bit for
this is only 0 if both the X-Y and Y-X operands have the nsw flag.
But we're only checking the Y-X operand.
Amara Emerson [Fri, 2 Aug 2019 23:44:24 +0000 (23:44 +0000)]
Re-commit "[GlobalISel] Add legalization support for non-power-2 loads and stores""
This is an old commit that exposed a bug in the GISel importer, which caused
non-truncating stores to be selected for truncating store patterns. Now that's
been fixed in r367737 this can go back in.
Craig Topper [Fri, 2 Aug 2019 23:43:53 +0000 (23:43 +0000)]
[ScalarizeMaskedMemIntrin] Bitcast the mask to the scalar domain and use scalar bit tests for the branches for expandload/compressstore.
Same as what was done for gather/scatter/load/store in r367489.
Expandload/compressstore were delayed due to lack of constant
masking handling that has since been fixed.
Yonghong Song [Fri, 2 Aug 2019 23:16:44 +0000 (23:16 +0000)]
[BPF] Handling type conversions correctly for CO-RE
With newly added debuginfo type
metadata for preserve_array_access_index() intrinsic,
this patch did the following two things:
(1). checking validity before adding a new access index
to the access chain.
(2). calculating access byte offset in IR phase
BPFAbstractMemberAccess instead of when BTF is emitted.
For (1), the metadata provided by all preserve_*_access_index()
intrinsics are used to check whether the to-be-added type
is a proper struct/union member or array element.
For (2), with all available metadata, calculating access byte
offset becomes easier in BPFAbstractMemberAccess IR phase.
This enables us to remove the unnecessary complexity in
BTFDebug.cpp.
New tests are added for
. user explicit casting to array/structure/union
. global variable (or its dereference) as the source of base
. multi demensional arrays
. array access given a base pointer
. cases where we won't generate relocation if we cannot find
type name.
[lit] Fix 42812: lit test suite can no longer be run stand-alone
Summary:
This change updates the lit.cfg file to use llvm_config when it is available, but when it is not, it directly modifies the config object. This makes it possible to run the lit tests standalone without having built llvm (as long as the correct binaries are present in the path such as FileCheck and not).
Because the lit tests don't take a hard dependency on llvm_config, some features such as system-windows have to have definitions in lit's cfg file as well. This is a potential issue as the os features sometimes change names (for example, we went from windows to system-windows, etc.). This can cause drift between lit's tests and the rest of the llvm tests.
Yonghong Song [Fri, 2 Aug 2019 21:28:28 +0000 (21:28 +0000)]
[BPF] annotate DIType metadata for builtin preseve_array_access_index()
Previously, debuginfo types are annotated to
IR builtin preserve_struct_access_index() and
preserve_union_access_index(), but not
preserve_array_access_index(). The debug info
is useful to identify the root type name which
later will be used for type comparison.
For user access without explicit type conversions,
the previous scheme works as we can ignore intermediate
compiler generated type conversions (e.g., from union types to
union members) and still generate correct access index string.
The issue comes with user explicit type conversions, e.g.,
converting an array to a structure like below:
struct t { int a; char b[40]; };
struct p { int c; int d; };
struct t *var = ...;
... __builtin_preserve_access_index(&(((struct p *)&(var->b[0]))->d)) ...
Although BPF backend can derive the type of &(var->b[0]),
explicit type annotation make checking more consistent
and less error prone.
Another benefit is for multiple dimension array handling.
For example,
struct p { int c; int d; } g[8][9][10];
... __builtin_preserve_access_index(&g[2][3][4].d) ...
It would be possible to calculate the number of "struct p"'s
before accessing its member "d" if array debug info is
available as it contains each dimension range.
This patch enables to annotate IR builtin preserve_array_access_index()
with proper debuginfo type. The unit test case and language reference
is updated as well.
Signed-off-by: Yonghong Song <yhs@fb.com>
Differential Revision: https://reviews.llvm.org/D65664
Amara Emerson [Fri, 2 Aug 2019 21:15:36 +0000 (21:15 +0000)]
[AArch64][GlobalISel] Eliminate redundant G_ZEXT when the source is implicitly zext-loaded.
These cases can come up when the extending loads combiner doesn't combine a
zext(load) to a zextload op, due to some other operation being in between, which
then gets simplified at a later stage.
Philip Reames [Fri, 2 Aug 2019 20:17:37 +0000 (20:17 +0000)]
[Statepoints] Fix overalignment of loads in no-realign-stack functions
This really should have been part of 366765. For some reason, I forgot to handle the corresponding load side, and the readable test cases (using deopt vs statepoints) turned out to be overly reduced. Oops.
As seen in the test change, the problem was that we were using a load with alignment expectations rather than the unaligned variant when the stack alignment was less than that prefered type alignment.
Lang Hames [Fri, 2 Aug 2019 19:43:20 +0000 (19:43 +0000)]
[ORC] Turn on symbol-flags overrides for LLJIT on Windows by default.
libObject does not apply the Exported flag to symbols in COFF object files,
which can lead to assertions when the symbol flags initially derived from
IR added to the JIT clash with the flags seen by the JIT linker. Both
RTDyldObjectLinkingLayer and ObjectLinkingLayer have a workaround for this:
they can be told to override the flags seen by the linker with the flags
attached to the materialization responsibility object that was passed down
to the linker. This patch modifies LLJIT's setup code to enable this override
by default on platforms where COFF is the default object format.
Sanjay Patel [Fri, 2 Aug 2019 19:33:46 +0000 (19:33 +0000)]
[DAGCombiner] try to convert opposing shifts to casts
This reverses a questionable IR canonicalization when a truncate
is free:
sra (add (shl X, N1C), AddC), N1C -->
sext (add (trunc X to (width - N1C)), AddC')
https://rise4fun.com/Alive/slRC
More details in PR42644:
https://bugs.llvm.org/show_bug.cgi?id=42644
I limited this to pre-legalization for code simplicity because that
should be enough to reverse the IR patterns. I don't have any
evidence (no regression test diffs) that we need to try this later.
Alina Sbirlea [Fri, 2 Aug 2019 18:37:03 +0000 (18:37 +0000)]
[NewPassManager] Resolve assertion in CGSCCPassManager when CallCounts change.
Summary:
If the CallCounts change after an iteration of the DevirtSCCRepeatedPass, this is not reflected in the local CallCounts structure triggering the assertion checking the before/after sizes.
Since it is valid for the size to change and this only uses the CallCounts for the devirtualizing heuristic, keep a <Function*, CallCount> map instead, and make the devirtualizing decision using the counts for the functions that exist both before and after the pass.
Alina Sbirlea [Fri, 2 Aug 2019 18:06:54 +0000 (18:06 +0000)]
[SimplifyCFG] Cleanup redundant conditions [NFC].
Summary:
Since the for loop iterates over BB's predecessors, the branch conditions found must have BB as one of the successors.
For an unconditional branch the successor must be BB, added `assert`.
For a conditional branch, one of the two successors must be BB, simplify `else if` to `else` and `assert`.
Sink common instructions outside the if/else block.