Summary:
Before, they were one category of operands which could cause
crashes in non-sensical combinations, e.g. "f32.const symbol".
Now these are forced to be an error.
George Rimar [Mon, 8 Jul 2019 16:53:39 +0000 (16:53 +0000)]
[llvm\test\Object] - An initial step to cleanup the test cases.
This patch removes trivial-object-test.elf-i386,
trivial-object-test.elf-x86-64 and trivial-object-test2.elf-x86-64
precompiled objects from test/Object/Inputs folder.
I adjusted the existent test cases to use YAML instead.
[InstCombine] canonicalize insert+splat to/from element 0 of vector
We recognize a splat from element 0 in (VectorUtils) llvm::getSplatValue()
and also in ShuffleVectorInst::isZeroEltSplatMask(), so this converts
to that form for better matching.
The backend generically turns these patterns into build_vector,
so there should be no codegen difference.
Kevin P. Neal [Mon, 8 Jul 2019 16:18:18 +0000 (16:18 +0000)]
Teach the IRBuilder about fadd and friends.
The IRBuilder has calls to create floating point instructions like fadd.
It does not have calls to create constrained versions of them. This patch
adds support for constrained creation of fadd, fsub, fmul, fdiv, and frem.
Reviewed by: John McCall, Sanjay Patel
Approved by: John McCall
Differential Revision: https://reviews.llvm.org/D53157
Brian Homerding [Mon, 8 Jul 2019 15:57:56 +0000 (15:57 +0000)]
Add, and infer, a nofree function attribute
This patch adds a function attribute, nofree, to indicate that a function does
not, directly or indirectly, call a memory-deallocation function (e.g., free,
C++'s operator delete).
Alex Bradbury [Mon, 8 Jul 2019 14:52:36 +0000 (14:52 +0000)]
[Triple] Add isRISCV function
This matches isARM, isThumb, isAArch64 and similar helpers. Future commits
which clean-up code that currently checks for Triple::riscv32 ||
Triple::riscv64.
Differential Revision: https://reviews.llvm.org/D54215
Patch by Simon Cook.
Test case added by Alex Bradbury.
Petar Avramovic [Mon, 8 Jul 2019 14:45:52 +0000 (14:45 +0000)]
[MIPS GlobalISel] Register bank select for G_LOAD. Select i64 load
Select gprb or fprb when loaded value is used by either:
copy to physical register or
instruction with only one mapping available for that use operand.
Load of integer s64 is handled with narrowScalar when mapping is applied,
produced artifacts are combined away. Manually set gprb to all register
operands of instructions created during narrowScalar.
Petar Avramovic [Mon, 8 Jul 2019 14:36:36 +0000 (14:36 +0000)]
[MIPS GlobalISel] Register bank select for G_STORE. Select i64 store
Select gprb or fprb when stored value is defined by either:
copy from physical register or
instruction with only one mapping available for that def operand.
Store of integer s64 is handled with narrowScalar when mapping is applied,
produced artifacts are combined away. Manually set gprb to all register
operands of instructions created during narrowScalar.
[AMDGPU][MC] Corrected parsing of FLAT offset modifier
Summary of changes:
- simplified handling of FLAT offset: offset_s13 and offset_u12 have been replaced with flat_offset;
- provided information about error position for pre-gfx9 targets;
- improved errors handling.
[ARM] Relax constraints on operands of VQxDMLxDH instructions
Summary:
According to a recently updated Armv8-M spec
(https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf) the
32-bit width versions of the following instructions:
* VQDMLADH
* VQDMLADHX
* VQRDMLADH
* VQRDMLADHX
* VQDMLSDH
* VQDMLSDHX
* VQRDMLSDH
* VQRDMLSDHX
are no longer unpredictable when their output register is the same as
one of the input registers.
This patch updates the assembler parser and the corresponding tests
and also removes @earlyclobber from the instruction constraints.
Alex Bradbury [Mon, 8 Jul 2019 09:16:47 +0000 (09:16 +0000)]
[RISCV] Specify registers used in DWARF exception handling
Defines RISCV registers for getExceptionPointerRegister() and
getExceptionSelectorRegister().
Differential Revision: https://reviews.llvm.org/D63411
Patch by Edward Jones.
Modified by Alex Bradbury to add CHECK lines to exception-pointer-register.ll.
Alex Bradbury [Mon, 8 Jul 2019 08:34:16 +0000 (08:34 +0000)]
[UpdateTestChecks] Skip over .Lfunc_begin for RISC-V
This mirrors the change made for X86 in rL336987. Without this patch,
update_llc_test_checks will completely skip functions with personality
functions.
[llvm-bcanalyzer] Refactor and move to libLLVMBitReader
This allows us to use the analyzer from unit tests.
* Refactor the interface to use proper error handling for most functions
after JF's work.
* Move everything into a BitstreamAnalyzer class.
* Move that to Bitcode/BitcodeAnalyzer.h.
David Majnemer [Sun, 7 Jul 2019 04:47:37 +0000 (04:47 +0000)]
[CodeGen] Add larger vector types for i32 and f32
Some out of tree backend require larger vector type. Since maintaining the changes out of tree is difficult due to the many manual changes needed when adding a new type we are adding it even if no backend currently use it.
[X86] Remove patterns from MOVLPSmr and MOVHPSmr instructions.
These patterns are the same as the MOVLPDmr and MOVHPDmr patterns,
but with a bitcast at the end. We can just select the PD instruction
and let execution domain fixing switch to PS.
[X86] Add patterns to select MOVLPDrm from MOVSD+load and MOVHPD from UNPCKL+load.
These narrow the load so we can only do it if the load isn't
volatile.
There also tests in vector-shuffle-128-v4.ll that this should
support, but we don't seem to fold bitcast+load on pre-sse4.2
targets due to the slow unaligned mem 16 flag.
Philip Reames [Sat, 6 Jul 2019 04:28:00 +0000 (04:28 +0000)]
[IRBuilder] Fold consistently for or/and whether constant is LHS or RHS
Without this, we have the unfortunate property that tests are dependent on the order of operads passed the CreateOr and CreateAnd functions. In actual usage, we'd promptly optimize them away, but it made tests slightly more verbose than they should have been.
Philip Reames [Sat, 6 Jul 2019 03:46:18 +0000 (03:46 +0000)]
[IRBuilder] Introduce helpers for and/or of multiple values at once
We had versions of this code scattered around, so consolidate into one location.
Not strictly NFC since the order of intermediate results may change in some places, but since these operations are associatives, should not change results.
Although removeCopyByCommutingDef deals with full copies, it is still
possible to copy undef lanes and thus, we wouldn't have any a value
number for these lanes.
Matt Arsenault [Fri, 5 Jul 2019 23:33:43 +0000 (23:33 +0000)]
RegUsageInfoCollector: Skip AMDGPU entry point functions
I'm not sure if it's worth it or not to add a hook to disable the pass
for an arbitrary function.
This pass is taking up to 5% of compile time in tiny programs by
iterating through all of the physical registers in every register
class. This pass should be rewritten in terms of regunits. For now,
skip doing anything for entry point functions. The vast majority of
functions in the real world aren't callable, so just not running this
will give the majority of the benefit.
Summary:
This patch simplifies 2 aspects in the FileCheckNumericVariable code.
First, setValue() method is turned into a void function since being
called only on undefined variable is an invariant and is now asserted
rather than returned. This remove the assert from the callers.
Second, clearValue() method is also turned into a void function since
the only caller does not check its return value since it may be trying
to clear the value of variable that is already cleared without this
being noteworthy.
Only custom lower uaddo+addcarry or usubo+subcarry chains and leave
mixtures like usubo+addcarry or uaddo+subcarry to the generic
legalizer. Otherwise we run into issues because SystemZ uses
different CC values for carries and borrows.
Matt Arsenault [Fri, 5 Jul 2019 20:26:13 +0000 (20:26 +0000)]
AMDGPU: Make AMDGPUPerfHintAnalysis an SCC pass
Add a string attribute instead of directly setting
MachineFunctionInfo. This avoids trying to get the analysis in the
MachineFunctionInfo in a way that doesn't work with the new pass
manager.
This will also avoid re-visiting the call graph for every single
function.
[X86] Correct the size check in foldMemoryOperandCustom.
The Size either needs to be 0 meaning we aren't folding
a stack reload. Or the stack slot needs to be at least
16 bytes. I've also added a paranoia check ensure the
RCSize is at leat 16 bytes as well. This avoids any
FR32/FR64 surprises, but I think we already filtered
those earlier.
All of our test case have Size as either 0 or 16 and
RCSize == 16. So the Size <= 16 check worked for those
cases.
[PowerPC] Move TOC save to prologue when profitable
The indirect call sequence on PPC requires that the TOC base register be saved
prior to the indirect call and restored after the call since the indirect call
may branch to a global entry point in another DSO which will update the TOC
base. Over the last couple of years, we have improved this to:
- be able to hoist TOC saves from loops (with changes to MachineLICM)
- avoid multiple saves when one dominates the other[s]
However, it is still possible to have multiple TOC saves dynamically in the
execution path if there is no dominance relationship between them.
This patch moves the TOC save to the prologue when one of the TOC saves is in a
block that post-dominates entry (i.e. it cannot be avoided) or if it is in a
block that is hotter than entry.
[X86] Remove unnecessary isel pattern for MOVLPSmr.
This was identical to a pattern for MOVPQI2QImr with a bitcast
as an input. But we should be able to turn MOVPQI2QImr into
MOVLPSmr in the execution domain fixup pass so we shouldn't
need this.
James Henderson [Fri, 5 Jul 2019 16:38:52 +0000 (16:38 +0000)]
[docs][llvm-readobj] Add a note to options that do nothing in GNU output
--section-data, --section-relocations and --section-symbols have no
effect for GNU style ouput. This patch changes the docs to point this
out, as it has caught me out on a couple of occasions.
See also https://bugs.llvm.org/show_bug.cgi?id=42522.
Summary:
This patch changes expression support to use one instance of
FileCheckNumericVariable per numeric variable rather than one per
variable and per definition. The current system was only necessary for
the last patch of the numeric expression support patch series in order
to handle a line using a variable defined earlier on the same line from
the input text. However this can be dealt more efficiently.
[FileCheck] Don't diagnose undef vars at parse time
Summary:
Diagnosing use of undefined variables takes place in
parseNumericVariableUse() and printSubstitutions() for numeric variables
but only takes place in printSubstitutions() for string variables. The
reason for the split location of diagnostics is that parsing is not
aware of the clearing of variables due to --enable-var-scope and thus
use of variables cleared in this way can only be catched by
printSubstitutions().
Beyond the code level inconsistency, there is also a user facing
inconsistency since diagnostics look different between the two
functions. While the diagnostic in printSubstitutions is more verbose,
doing the diagnostic there allows to diagnose all undefined variables
rather than just the first one and error out.
This patch create dummy variable definition when encountering a use of
undefined variable so that parsing can proceed and be diagnosed by
printSubstitutions() later. Tests that were testing whether parsing
fails in such case are thus modified accordingly.
Matt Arsenault [Fri, 5 Jul 2019 15:32:28 +0000 (15:32 +0000)]
ScheduleDAG: Fix incorrectly killing registers in bundles
When looking for uses/defs to add kill flags, the iterator was double
incremented, skipping the first instruction in the bundle. The use
register in the first bundle instruction was then incorrectly killed.
The "First" instruction should be the BUNDLE itself as the proper
reverse iterator endpoint.
Jay Foad [Fri, 5 Jul 2019 14:52:48 +0000 (14:52 +0000)]
[AMDGPU] DPP combiner: recognize identities for more opcodes
Summary:
This allows the DPP combiner to kick in more often. For example the
exclusive scan generated by the atomic optimizer for a divergent atomic
add used to look like this:
Graham Hunter [Fri, 5 Jul 2019 12:48:16 +0000 (12:48 +0000)]
Scalable Vector IR Type with further LTO fixes
Reintroduces the scalable vector IR type from D32530, after it was reverted
a couple of times due to increasing chromium LTO build times. This latest
incarnation removes the walk over aggregate types from the verifier entirely,
in favor of rejecting scalable vectors in the isValidElementType methods in
ArrayType and StructType. This removes the 70% degradation observed with
the second repro tarball from PR42210.
Robert Lougher [Fri, 5 Jul 2019 12:42:06 +0000 (12:42 +0000)]
This reverts r365061 and r365062 (test update)
Revision r365061 changed a skip of debug instructions for a skip
of meta instructions. This is not safe, as IMPLICIT_DEF is classed
as a meta instruction.
Sam Elliott [Fri, 5 Jul 2019 12:35:21 +0000 (12:35 +0000)]
[RISCV] Support @llvm.readcyclecounter() Intrinsic
On RISC-V, the `cycle` CSR holds a 64-bit count of the number of clock
cycles executed by the core, from an arbitrary point in the past. This
matches the intended semantics of `@llvm.readcyclecounter()`, which we
currently leave to the default lowering (to the constant 0).
With this patch, we will now correctly lower this intrinsic to the
intended semantics, using the user-space instruction `rdcycle`. On
64-bit targets, we can directly lower to this instruction.
On 32-bit targets, we need to do more, as `rdcycle` only returns the low
32-bits of the `cycle` CSR. In this case, we perform a custom lowering,
based on the PowerPC lowering, using `rdcycleh` to obtain the high
32-bits of the `cycle` CSR. This custom lowering inserts a new basic
block which detects overflow in the high 32-bits of the `cycle` CSR
during reading (because multiple instructions are required to read). The
emitted assembly matches the suggested assembly in the RISC-V
specification.
lld, llvm-dlltool, llvm-lib: Use getAsString() instead of getSpelling() for printing unknown args
Since OPT_UNKNOWN args never have any values and consist only of
spelling (and are never aliased), this doesn't make any difference in
practice, but it's more consistent with Arg's guidance to use
getAsString() for diagnostics, and it matches what clang does.
Also tweak two tests to use an unknown option that contains '=' for
additional coverage while here. (The new tests pass fine with the old
code too though.)
Robert Lougher [Fri, 5 Jul 2019 12:20:21 +0000 (12:20 +0000)]
This reverts r365061 and r365062 (test update)
Revision r365061 changed a skip of debug instructions for a skip
of meta instructions. This is not safe, as IMPLICIT_DEF is classed
as a meta instruction.
[FileCheck] Fix comment in parseNumericVariableUse
Summary:
Comment explaining the interaction between parsing of numeric variable
definition and uses in parseNumericVariableUse is stale since it
suggests both use and definition parsing is done in the same function.
This was the case in a previous version of the patch committed as 71d3f227a790d6cf39d8c6267940e0dc0c237e11 but is no longer the case. This
patch updates the comment accordingly.
Summary:
Both callers of parseNumericVariableDefinition() perform the same extra
check that no character is found after the variable name. This patch
factors out this check into parseNumericVariableDefinition().