[X86][AVX] Ensure chained subvector insertions are the same size (PR42833)
Before combining insert_subvector(insert_subvector(vec, sub0, c0), sub1, c1) patterns, ensure that the subvectors are all the same type. On AVX512 targets especially we might have a mixture of 128/256 subvector insertions.
------------------------------------------------------------------------
Commit https://reviews.llvm.org/D57939 ("[DWARF] Refactor
RelocVisitor and fix computation of SHT_RELA-typed relocation entries)
made a change for relocation resolution when operating
on an object file.
The change unfortunately broke BPF as given SymbolValue (S) and
Addent (A), previously relocation is resolved to
S + A
and after the change, it is resolved to
S
This patch fixed the issue by resolving relocation correctly.
It looks not all relocation resolution reaches here and I did not
trace down exactly when. But I do find if the object file includes
codes in two different ELF sections than default ".text",
the above bug will be triggered.
This patch included a trivial two function source code to
demonstrate this issue. The relocation for .debug_loc is resolved
incorrectly due to this and llvm-objdump cannot display source
annotated assembly.
If OptimizeExtractBits() encountered a shift instruction with no operands at all,
it would erase the instruction, but still return false.
This previously didn’t matter because its caller would always return after
processing the instruction, but https://reviews.llvm.org/D63233 changed the
function’s caller to fall through if it returned false, which would then cause
a use-after-free detectable by ASAN.
This change makes OptimizeExtractBits return true if it removes a shift
instruction with no users, terminating processing of the instruction.
[DebugInfo] Avoid crash from dropped fragments in LiveDebugValues
This patch avoids a crash caused by DW_OP_LLVM_fragments being dropped
from DIExpressions by LiveDebugValues spill-restore code. The appearance
of a previously unseen fragment configuration confuses LDV, as documented
in PR42773, and reproduced by the test function this patch adds (Crashes
on a x86_64 debug build).
To avoid this, on spill restore, we now use fragment information from the
spilt-location-expression.
In addition, when spilling, we now don't spill any DBG_VALUE with a complex
expression, as it can't be safely restored and will definitely lead to an
incorrect variable location. The discussion of this is in D65368.
[cmake] install_symlink should obey DESTDIR unconditionally
Setting DESTDIR was erroneously buried under a condition here - if
it's set it should always be used.
------------------------------------------------------------------------
[AArch64][SVE] Allow explicit size specifier for predicate operand
... for the vector forms of `{SQ,UQ,}{INC,DEC}P` instructions. Also continue
supporting the exsting behaviour of not requiring an explicit size
specifier. The preferred disasembly is *with* the specifier.
This is implemented by redefining intruction forms to require vector predicates
with explicit size and adding aliases, which allow a predicate with no size.
Summary:
Currently the RISC-V backend does not realign the stack. This can be an issue even for the RV32I/RV64I ABIs (where the stack is 16-byte aligned), though is rare. It will be much more comment with RV32E (though the alignment requirements for common data types remain under-documented...).
This patch adds minimal support for stack realignment. It should cope with large realignments. It will error out if the stack needs realignment and variable sized objects are present.
It feels like a lot of the code like getFrameIndexReference and determineFrameLayout could be refactored somehow, as right now it feels fiddly and brittle. We also seem to allocate a lot more memory than GCC does for equivalent C code.
Summary:
A block address may be used in inline assembly. In which case it
requires a name so that the asm parser has something to parse. Creating
a name for every block address is a large hammer, but is necessary
because at the point when a temp symbol is created we don't necessarily
know if it's used in inline asm. This ensures that it exists regardless.
Summary:
This patch keeps track of MCSymbols created for blocks that were
referenced in inline asm. It prevents creating a new symbol which
doesn't refer to the block.
Inline asm may have a reference to a label. The asm parser however
doesn't recognize it as a label and tries to create a new symbol. The
result being that instead of the original symbol (e.g. ".Ltmp0") the
parser replaces it in the inline asm with the new one (e.g. ".Ltmp00")
without updating it in the symbol table. So the machine basic block
retains the "old" symbol (".Ltmp0"), but the inline asm uses the new one
(".Ltmp00").
Summary:
Ana Pazos reported a bug where we were not checking that an APInt would
fit into 64-bits before calling `getSExtValue()`. This caused asserts when
compiling large constants, such as i128s, as happens when compiling compiler-rt.
This patch adds a testcase and makes the callback less error-prone.
Hans Wennborg [Tue, 13 Aug 2019 11:56:03 +0000 (11:56 +0000)]
Merging r368517, r368518, r368519, and r368554:
------------------------------------------------------------------------
r368517 | lebedevri | 2019-08-10 21:28:12 +0200 (Sat, 10 Aug 2019) | 1 line
[NFC][InstCombine] Tests for shift amount reassociation in bittest with shift of const
------------------------------------------------------------------------
[InstCombine] Shift amount reassociation in bittest: drop pointless one-use restriction
That one-use restriction is not needed for correctness - we have already
ensured that one of the shifts will go away, so we know we won't increase
the instruction count. So there is no need for that restriction.
------------------------------------------------------------------------
[InstCombine] Shift amount reassociation in bittest: relax one-use check when shifting constant
If one of the values being shifted is a constant, since the new shift
amount is known-constant, the new shift will end up being constant-folded
so, we don't need that one-use restriction then.
------------------------------------------------------------------------
Recommit "[MS] Emit S_HEAPALLOCSITE debug info in Selection DAG"
with a fix to clear the SDNode map when SelectionDAG is cleared.
------------------------------------------------------------------------
[X86] Make CMPXCHG16B feature imply CMPXCHG8B feature.
This fixes znver1 so that it properly enables CMPXHG8B. We can
probably remove explicit CMPXCHG8B from CPUs that also have
CMPXCHG16B, but keeping this simple to allow cherry pick to 9.0.
Emit diagnostic if an inline asm constraint requires an immediate
Summary:
An inline asm call can result in an immediate after inlining. Therefore emit a
diagnostic here if constraint requires an immediate but one isn't supplied.
Summary:
This adds the 'f' inline assembly constraint, as supported by GCC. An
'f'-constrained operand is passed in a floating point register. Exactly
which kind of floating-point register (32-bit or 64-bit) is decided
based on the operand type and the available standard extensions (-f and
-d, respectively).
This patch adds support in both the clang frontend, and LLVM itself.
lit: Use a License classifier that pypi will accept
Summary:
'OSI Approved :: Apache-2.0 with LLVM exception' is not a valid
classifier. 'OSI Approved :: Apache Software License' is the closest
fit for the new license, so we've decided to use this one.
The classifiers seem to only be used for searching on the pypi website,
so this does not actually change the license of the code.
We still pass 'Apache-2.0 with LLVM exception' as the license to setup(),
and this appears alongside the classifier on the pypi webpage for lit.
Patch D56593 by @courbet results in calls to `bcmp()` in some cases, should
the target support the it. Unless `TTI::MemCmpExpansionOptions()`
is overridden by the target.
In a proprietary benchmark we see a performance drop of about 12% on PNG
compression before this patch, though it passes all tests.
This patch mirrors X86 for AArch64 and initializes
`TTI::MemCmpExpansionOptions()` to then expand calls to `bcmp()` when
appropriate. No tuning of the parameters was performed, but, at this point,
it's enough to recover the performance drop above.
This problem also exists on ARM. Once a consensus is reached for AArch64, we
can work to fix ARM as well.
Add a note to the release not about a potentially breaking optimization
This has come up twice already (once in pr42763 and once in the commit thread), so give warning of a new way in which UB can result in unexpected program behavior.
Hans Wennborg [Tue, 6 Aug 2019 08:18:29 +0000 (08:18 +0000)]
Merge r367730 for PR42812
> [lit] Fix 42812: lit test suite can no longer be run stand-alone
>
> Summary:
> This change updates the lit.cfg file to use llvm_config when it is available, but when it is not, it directly modifies the config object. This makes it possible to run the lit tests standalone without having built llvm (as long as the correct binaries are present in the path such as FileCheck and not).
>
> Because the lit tests don't take a hard dependency on llvm_config, some features such as system-windows have to have definitions in lit's cfg file as well. This is a potential issue as the os features sometimes change names (for example, we went from windows to system-windows, etc.). This can cause drift between lit's tests and the rest of the llvm tests.
>
> Reviewers: probinson, mgorny
>
> Reviewed By: mgorny
>
> Subscribers: delcypher, llvm-commits, asmith
>
> Tags: #llvm
>
> Differential Revision: https://reviews.llvm.org/D65674
[X86] SimplifyDemandedVectorEltsForTargetNode - Move SUBV_BROADCAST narrowing handling. NFCI.
Move the narrowing of SUBV_BROADCAST to where we handle all the other opcodes.
------------------------------------------------------------------------
[X86][AVX] SimplifyDemandedVectorElts - handle extraction from X86ISD::SUBV_BROADCAST source (PR42819)
PR42819 showed an issue that we couldn't handle the case where we demanded a 'sub-sub-vector' of the SUBV_BROADCAST 'sub-vector' source.
This patch recognizes these cases and extracts the sub-sub-vector instead of trying to broadcast to a type smaller than the 'sub-vector' source.
------------------------------------------------------------------------
Hans Wennborg [Mon, 5 Aug 2019 13:10:20 +0000 (13:10 +0000)]
Merging r367846 and r367847:
------------------------------------------------------------------------
r367846 | hans | 2019-08-05 15:04:07 +0200 (Mon, 05 Aug 2019) | 1 line
Write the RequiredLibraries for 'all' in LibraryDependencies.inc in a deterministic order (PR42739)
------------------------------------------------------------------------
------------------------------------------------------------------------
r367847 | hans | 2019-08-05 15:04:12 +0200 (Mon, 05 Aug 2019) | 8 lines
test-release.sh: Perform the sed substitution on both files (PR42739)
The comparison would otherwise fail if Phase2 occurrs naturally in the
object file. It would get replaced with Phase3 in the one .o, but not
in the other.
We were already running both files through sed to have them processed in
this same way; this is a logical extension of that.
------------------------------------------------------------------------
[AliasAnalysis] Initialize a member variable that may be used by unit test.
The unit tests in BasicAliasAnalysisTest use the alias analysis API
directly and do not call setAAResults to initalize AAR. This gives a
valgrind error "Conditional Jump depends on unitialized variable".
On most buildbots the variable is nullptr, but in some cases it can be
non nullptr leading to seemingly random failures.
These tests were disabled in r366986. With the initialization they can be
enabled again.
[Thumb] Fix invalid symbol redefinition due to duplicated jumptable (PR42760)
Fix for https://bugs.llvm.org/show_bug.cgi?id=42760. A tBR_JTr
instruction is duplicated by tail duplication, which results in
the same jumptable with the same label being emitted twice.
Fix this by marking tBR_JTr as not duplicable. The corresponding
ARM/Thumb instructions are already marked as not duplicable.
Additionally also mark tTBB_JT and tTBH_JT to be consistent with
Thumb2, even though this shouldn't be strictly necessary.
Summary:
`DivRemPairs` internally creates two maps:
* {sign, divident, divisor} -> div instruction
* {sign, divident, divisor} -> rem instruction
Then it iterates over rem map, and looks if there is an entry
in div map with the same key. Then depending on some internal logic
it may RAUW rem instruction with something else.
But if that rem instruction is an input to other div/rem,
then it was used as a key in these maps, so the old value (used in key)
is now dandling, because RAUW didn't update those maps.
And we can't even RAUW map keys in general, there's `ValueMap`,
but we don't have a single `Value` as key...
The bug was discovered via D65298, and the test there exists.
Now, i'm not sure how to expose this issue in trunk.
The bug is clearly there if i change the map keys to be `AssertingVH`/`PoisoningVH`,
but i guess this didn't miscompiled anything thus far?
I really don't think this is benin without that patch.
The fix is actually rather straight-forward - instead of trying to somehow
shoe-horn `ValueMap` here (doesn't fit, key isn't just `Value`), or writing a new
`ValueMap` with key being a struct of `Value`s, we can just have an intermediate
data structure - a vector, each entry containing matching `Div, Rem` pair,
and pre-filling it before doing any modifications.
This way we won't need to query map after doing RAUW, so no bug is possible.
This is the compantion patch to https://reviews.llvm.org/D64482, needed to ensure
that builds with host compilers that don't yet predefine _FILE_OFFSET_BITS=64 on
Solaris succeed by always making the host and freshly built clang consistent.
[AArch64][SVE2] Use destination register as source register
Summary:
This patch fixes a bug in the following instructions that should have been
implemented as destructive. A destructive instruction is an instruction where
one of the source registers also acts as the destination register. Therefore,
the contents of the source register, when the instruction begins execution, are
replaced by the result of the instruction when the instruction completes
execution [1]:
Summary:
* Clarify comment with SVE2 for predicated shifts and move next to other
shift instructions.
* Clarify comments for various instructions.
* Move FCVTX instruction next to other fp conversions.
* Move FLOGB to next to other fp instructions and fix description.
* Remove "cons" from non-constructive multiclass for bitwise shift-right
and accumulate instructions.
Summary:
* Loads and stores in SVE2 are gather/scatter not contiguous, fixed by
renaming multiclasses to reflect this and also updated comments.
* Remove aliases from load/store multiclasses that reflect the behaviour
of the original form.
* Fix bug in scatter store implementation, vector list should be used as
input, not output.
Mark test/MC/RISCV/rv{32,64}i-aliases-invalid.s unsupported also on Windows
Because they fail there too.
FAIL: LLVM :: MC/RISCV/rv32i-aliases-invalid.s (24397 of 32659)
******************** TEST 'LLVM :: MC/RISCV/rv32i-aliases-invalid.s' FAILED ********************
Script:
--
: 'RUN: at line 2'; not c:\src\llvm.monorepo\build.release2\bin\llvm-mc.exe C:\src\llvm.monorepo\llvm\test\MC\RISCV\rv32i-aliases-invalid.s -triple=riscv32 -riscv-no-aliases 2>&1 | c:\src\llvm.monorepo\build.release2\bin\filecheck.exe C:\src\llvm.monorepo\llvm\test\MC\RISCV\rv32i-aliases-invalid.s
: 'RUN: at line 3'; not c:\src\llvm.monorepo\build.release2\bin\llvm-mc.exe C:\src\llvm.monorepo\llvm\test\MC\RISCV\rv32i-aliases-invalid.s -triple=riscv32 2>&1 | c:\src\llvm.monorepo\build.release2\bin\filecheck.exe C:\src\llvm.monorepo\llvm\test\MC\RISCV\rv32i-aliases-invalid.s
--
Exit Code: 1
Command Output (stdout):
--
$ ":" "RUN: at line 2"
$ "not" "c:\src\llvm.monorepo\build.release2\bin\llvm-mc.exe" "C:\src\llvm.monorepo\llvm\test\MC\RISCV\rv32i-aliases-invalid.s" "-triple=riscv32" "-riscv-no-aliases"
$ "c:\src\llvm.monorepo\build.release2\bin\filecheck.exe" "C:\src\llvm.monorepo\llvm\test\MC\RISCV\rv32i-aliases-invalid.s"
C:\src\llvm.monorepo\llvm\test\MC\RISCV\rv32i-aliases-invalid.s:10:21: error: CHECK: expected string not found in input
li t4, foo # CHECK: :[[@LINE]]:8: error: immediate must be an integer in the range [-2147483648, 4294967295]
^
<stdin>:5:1: note: scanning from here
li x0, -2147483649 # CHECK: :[[@LINE]]:8: error: immediate must be an integer in the range [-2147483648, 4294967295]
^
<stdin>:5:1: note: with "@LINE" equal to "10"
li x0, -2147483649 # CHECK: :[[@LINE]]:8: error: immediate must be an integer in the range [-2147483648, 4294967295]
^
<stdin>:5:38: note: possible intended match here
li x0, -2147483649 # CHECK: :[[@LINE]]:8: error: immediate must be an integer in the range [-2147483648, 4294967295]
^
error: command failed with exit status: 1
--
--
********************
Testing: 0 .. 10.. 20.. 30.. 40.. 50.. 60.. 70
FAIL: LLVM :: MC/RISCV/rv64i-aliases-invalid.s (24416 of 32659)
******************** TEST 'LLVM :: MC/RISCV/rv64i-aliases-invalid.s' FAILED ********************
Script:
--
: 'RUN: at line 2'; not c:\src\llvm.monorepo\build.release2\bin\llvm-mc.exe C:\src\llvm.monorepo\llvm\test\MC\RISCV\rv64i-aliases-invalid.s -triple=riscv64 -riscv-no-aliases 2>&1 | c:\src\llvm.monorepo\build.release2\bin\filecheck.exe C:\src\llvm.monorepo\llvm\test\MC\RISCV\rv64i-aliases-invalid.s
: 'RUN: at line 3'; not c:\src\llvm.monorepo\build.release2\bin\llvm-mc.exe C:\src\llvm.monorepo\llvm\test\MC\RISCV\rv64i-aliases-invalid.s -triple=riscv64 2>&1 | c:\src\llvm.monorepo\build.release2\bin\filecheck.exe C:\src\llvm.monorepo\llvm\test\MC\RISCV\rv64i-aliases-invalid.s
--
Exit Code: 1
Command Output (stdout):
--
$ ":" "RUN: at line 2"
$ "not" "c:\src\llvm.monorepo\build.release2\bin\llvm-mc.exe" "C:\src\llvm.monorepo\llvm\test\MC\RISCV\rv64i-aliases-invalid.s" "-triple=riscv64" "-riscv-no-aliases"
$ "c:\src\llvm.monorepo\build.release2\bin\filecheck.exe" "C:\src\llvm.monorepo\llvm\test\MC\RISCV\rv64i-aliases-invalid.s"
C:\src\llvm.monorepo\llvm\test\MC\RISCV\rv64i-aliases-invalid.s:6:21: error: CHECK: expected string not found in input
li t4, foo # CHECK: :[[@LINE]]:8: error: operand must be a constant 64-bit integer
^
<stdin>:2:1: note: scanning from here
li t5, 0x10000000000000000 # CHECK: :[[@LINE]]:8: error: unknown operand
^
<stdin>:2:1: note: with "@LINE" equal to "6"
li t5, 0x10000000000000000 # CHECK: :[[@LINE]]:8: error: unknown operand
^
<stdin>:13:67: note: possible intended match here
C:\src\llvm.monorepo\llvm\test\MC\RISCV\rv64i-aliases-invalid.s:12:13: error: immediate must be an integer in the range [0, 63]
^
error: command failed with exit status: 1
------------------------------------------------------------------------
Currently, the CO-RE offset relocation does not work
if any struct/union member or array element is a typedef.
For example,
typedef const int arr_t[7];
struct input {
arr_t a;
};
func(...) {
struct input *in = ...;
... __builtin_preserve_access_index(&in->a[1]) ...
}
The BPF backend calculated default offset is 0 while
4 is the correct answer. Similar issues exist for struct/union
typedef's.
When getting struct/union member or array element type,
we should trace down to the type by skipping typedef
and qualifiers const/volatile as this is what clang did
to generate getelementptr instructions.
(const/volatile member type qualifiers are already
ignored by clang.)
This patch fixed this issue, for each access index,
skipping typedef and const/volatile/restrict BTF types.
Signed-off-by: Yonghong Song <yhs@fb.com>
Differential Revision: https://reviews.llvm.org/D65259
------------------------------------------------------------------------
Currently, we expect the CO-RE offset relocation records
a string encoding the original getelementptr access index,
so kernel bpf loader can decode it correctly.
For example,
struct s { int a; int b; };
struct t { int c; int d; };
#define _(x) (__builtin_preserve_access_index(x))
int get_value(const void *addr1, const void *addr2);
int test(struct s *arg1, struct t *arg2) {
return get_value(_(&arg1->b), _(&arg2->d));
}
We expect two offset relocations:
reloc 1: type s, access index 0, 1
reloc 2: type t, access index 0, 1
Two globals are created to retain access indexes for the
above two relocations with global variable names.
The first global has a name "0:1:". Unfortunately,
the second global has the name "0:1:.1" as the llvm
internals automatically add suffix ".1" to a global
with the same name. Later on, the BPF peels the last
character and record "0:1" and "0:1:." in the
relocation table.
This is not desirable. BPF backend could use the global
variable suffix knowledge to generate correct access str.
This patch rather took an approach not relying on
that knowledge. It generates "s:0:1:" and "t:0:1:" to
avoid global variable suffixes and later on generate
correct index access string "0:1" for both records.
Signed-off-by: Yonghong Song <yhs@fb.com>
Differential Revision: https://reviews.llvm.org/D65258
------------------------------------------------------------------------
Summary: The test case added in D62718 did not work unless the user was root because write bits were not set for the output file. This change uses only permissions with user write (0200) to ensure tests pass regardless of the users permissions.
[MachineCSE][MachinePRE] Avoid hoisting code from code regions into hot BBs.
Summary:
Current PRE hoists common computations into
CMBB = DT->findNearestCommonDominator(MBB, MBB1).
However, if CMBB is in a hot loop body, we might get performance
degradation.
[x86] try harder to form LEA from ADD to avoid flag conflicts (PR40483)
LEA doesn't affect flags, so use it more liberally to replace an ADD when
we know that the ADD operands affect flags.
In the motivating example from PR40483:
https://bugs.llvm.org/show_bug.cgi?id=40483
...this lets us avoid duplicating a math op just to avoid flag conflict.
As mentioned in the TODO comments, this heuristic can be extended to
fire more often if that leads to more improvements.
[X86] Disable combineConcatVectors for vXi1 vectors.
I'm not convinced the code this calls is properly vetted for
vXi1 vectors. Experimental vector widening legalization testing
for D55251 is now hitting an assertion failure inside
EltsFromConsecutiveLoads. This is occurring from a v2i1 load
having a store size different than its VT size. Hopefully
this commit will keep such issues from happening.
Alex Bradbury [Thu, 18 Jul 2019 05:22:55 +0000 (05:22 +0000)]
[DWARF][RISCV] Add support for RISC-V relocations needed for debug info
When code relaxation is enabled many RISC-V fixups are not resolved but
instead relocations are emitted. This happens even for DWARF debug
sections. Therefore, to properly support the parsing of DWARF debug info
we need to be able to resolve RISC-V relocations. This patch adds:
* Support for RISC-V relocations in RelocationResolver
* DWARF support for two relocations per object file offset
* DWARF changes to support relocations in more DIE fields
The two relocations per offset change is needed because some RISC-V
relocations (used for label differences) come in pairs.
Relocations can also be emitted for DWARF fields where relocations were
not yet evaluated. Adding relocation support for some of these fields is
essencial. On the other hand, LLVM currently emits RISC-V relocations
for fixups that could be safely evaluated, since they can never be
affected by code relaxations. This patch also adds relocation support
for the fields affected by those extraneous relocations (the DWARF unit
entry Length, and the DWARF debug line entry TotalLength and
PrologueLength), for testing purposes.
Differential Revision: https://reviews.llvm.org/D62062
Patch by Luís Marques.
[llvm-bcanalyzer] Fixed error 'Expected<T> must be checked before access or destruction'
After rL365286 I had failing test:
LLVM :: tools/gold/X86/v1.12/thinlto_emit_linked_objects.ll
It was failing with the output:
$ llvm-bcanalyzer --dump llvm/test/tools/gold/X86/v1.12/Output/thinlto_emit_linked_objects.ll.tmp3.o.thinlto.bc
Expected<T> must be checked before access or destruction.
Unchecked Expected<T> contained error:
Unexpected end of file reading 0 of 0 bytesStack dump:
llvm-pdbdump: Fix several smaller issues with injected source compression handling
- getCompression() used to return a PDB_SourceCompression even though
the docs for IDiaInjectedSource are explicit about the return value
being compiler-dependent. Return an uint32_t instead, and make the
printing code handle unknown values better by printing "Unknown" and
the int value instead of not printing any compression.
- Print compressed contents as hex dump, not as string.
- Add compression type "DotNet", which is used (at least) by csc.exe,
the C# compiler. Also add a lengthy comment describing the stream
contents (derived from looking at the raw hex contents long enough
to see the GUIDs, which led me to the roslyn and mono implementations
for handling this).
- The native injected source dumper was dumping the contents of the
whole data stream -- but csc.exe writes a stream that's padded with
zero bytes to the next 512 boundary, and the dia api doesn't display
those padding bytes. So make NativeInjectedSource::getCode() do the
same thing.
This will let us instrument globals during initialization. This required
making the new PM pass a module pass, which should still provide access to
analyses via the ModuleAnalysisManager.
[PEI] Don't re-allocate a pre-allocated stack protector slot
The LocalStackSlotPass pre-allocates a stack protector and makes sure
that it comes before the local variables on the stack.
We need to make sure that later during PEI we don't re-allocate a new
stack protector slot. If that happens, the new stack protector slot will
end up being **after** the local variables that it should be protecting.
Therefore, we would have two slots assigned for two different stack
protectors, one at the top of the stack, and one at the bottom. Since
PEI will overwrite the assigned slot for the stack protector, the load
that is used to compare the value of the stack protector will use the
slot assigned by PEI, which is wrong.
For this, we need to check if the object is pre-allocated, and re-use
that pre-allocated slot.
Matt Arsenault [Wed, 17 Jul 2019 20:22:44 +0000 (20:22 +0000)]
GlobalISel: Handle widenScalar of arbitrary G_MERGE_VALUES sources
Extract the sources to the GCD of the original size and target size,
padding with implicit_def as necessary.
Also fix the case where the requested source type is wider than the
original result type. This was ignoring the type, and just using the
destination. Do the operation in the requested type and truncate back.
Matt Arsenault [Wed, 17 Jul 2019 20:22:38 +0000 (20:22 +0000)]
GlobalISel: Handle more cases for widenScalar of G_MERGE_VALUES
Use an anyext to the requested type for the leftover operand to
produce a slightly wider type, and then truncate the final merge.
I have another implementation almost ready which handles arbitrary
widens, but I think it produces worse code in this example (which I
think is 90% due to not folding redundant copies or folding out
implicit_def users), so I wanted to add this as a baseline first.
Lang Hames [Wed, 17 Jul 2019 16:40:52 +0000 (16:40 +0000)]
[ORC] Add deprecation warnings to ORCv1 layers and utilities.
Summary:
ORCv1 is deprecated. The current aim is to remove it before the LLVM 10.0
release. This patch adds deprecation attributes to the ORCv1 layers and
utilities to warn clients of the change.
These tests that failed on Darwin but passed on other machines due to the default archive format differing
on a Darwin machine, and what looks to be bugs in the output of this format.
I can not investigate these issue further so the tests are considered expected failures on Darwin.
Alex Bradbury [Wed, 17 Jul 2019 14:32:25 +0000 (14:32 +0000)]
[RISCV] Add RISCV to LLVM_ALL_TARGETS so it s built by default
This follows the RFC <http://lists.llvm.org/pipermail/llvm-dev/2019-July/133724.html>.
Follow-on commits will add appropriate release notes changes etc.
Pushing this now and in a minimal form so there is reasonable time before 9.0
branches to resolve any issues arising from e.g. the backend being exposed on
different sanitizer setups.
The current builder for RISC-V is on the staging build-bot
<http://lab.llvm.org:8014/builders/llvm-riscv-linux>, however with the RISCV
backend being built by default it won't provide any real additional coverage.
We will shortly set up a builder that runs the test-suite in qemu-user.
Alex Bradbury [Wed, 17 Jul 2019 14:00:35 +0000 (14:00 +0000)]
[AsmPrinter] Make the encoding of call sites in .gcc_except_table configurable and use for RISC-V
The original behavior was to always emit the offsets to each call site in the
call site table as uleb128 values, however on some architectures (eg RISCV)
these uleb128 offsets into the code cannot always be resolved until link time
(because relaxation will invalidate any calculated offsets), and there are no
appropriate relocations for uleb128 values. As a consequence it needs to be
possible to specify an alternative.
This also switches RISCV to use DW_EH_PE_udata4 for call side encodings in
.gcc_except_table
Differential Revision: https://reviews.llvm.org/D63415
Patch by Edward Jones.
Matt Arsenault [Wed, 17 Jul 2019 13:55:01 +0000 (13:55 +0000)]
Mips: Remove immarg from copy and insert intrinsics
These intrinsics do in fact work with non-constant index arguments.
These are lowered to either the generic
ISD::INSERT_VECTOR_ELT/ISD::EXTRACT_VECTOR_ELT, or to
VEXTRACT_SEXT_ELT. The handling of these all accept variable
indexes. Turning these into generic instructions which do allow
variables introduces complications in a future change to immarg
handling.
Since these just turn into generic instructions, these are kind of
pointless and should probably just be autoupgraded to
extractelement/insertelement.
Alex Bradbury [Wed, 17 Jul 2019 13:54:38 +0000 (13:54 +0000)]
[RISCV] Set correct encodings for DWARF exception handling
This patch sets correct encodings for DWARF exception handling for RISC-V
(other than call site encoding, which must be udata4 rather than uleb128 and
is handled by D63415).
This has the same intend as D63409, except this version matches GCC/binutils
behaviour which uses the same encodings regardless of PIC/non-PIC and
medlow/medany code model.