Simon Pilgrim [Thu, 7 Dec 2017 15:24:14 +0000 (15:24 +0000)]
[X86] Tag LZCNT/TZCNT instructions scheduler classes
Tagged as IMUL instructions for a reasonable approximation (ALU tends to be a lot faster) - POPCNT is currently tagged as FAdd which I think should be replaced with IMUL as well
Sanjay Patel [Thu, 7 Dec 2017 15:17:58 +0000 (15:17 +0000)]
[DAGCombiner] eliminate shuffle of insert element
I noticed this pattern in D38316 / D38388. We failed to combine a shuffle that is either
repeating a scalar insertion at the same position in a vector or translated to a different
element index.
Like the earlier patch, this could be an instcombine too, but since we opted to make this
a DAG transform earlier, I've made this one a DAG patch too.
We do not need any legality checking because the new insert is identical to the existing
insert except that it may have a different constant insertion operand.
The constant insertion test in test/CodeGen/X86/vector-shuffle-combining.ll was the
motivation for D38756.
Dan Gohman [Thu, 7 Dec 2017 13:49:27 +0000 (13:49 +0000)]
[WebAssemby] Support main functions with alternate signatures.
WebAssembly requires caller and callee signatures to match, so the usual
C runtime trick of calling main and having it just work regardless of
whether main is defined as '()' or '(int argc, char *argv[])' doesn't
work. Extend the FixFunctionBitcasts pass to rewrite main to use the
latter form.
[Nios2] final infrastructure to provide compilation of a return from a function
This patch includes all missing functionality needed to provide first
compilation of a simple program that just returns from a function.
I've added a test case that checks for "ret" instruction printed in assembly
output.
Patch by Andrei Grischenko (andrei.l.grischenko@intel.com)
Differential revision: https://reviews.llvm.org/D39688
[dsymutil] Add -verify option to run DWARF verifier after linking.
This patch adds support for running the DWARF verifier on the linked
debug info files. If the -verify options is specified and verification
fails, dsymutil exists with abort with non-zero exit code. This behavior
is *not* enabled by default.
Pavel Labath [Thu, 7 Dec 2017 10:54:23 +0000 (10:54 +0000)]
[Testing/Support] Make matchers work with Expected<T&>
Summary:
This did not work because the ExpectedHolder was trying to hold the
value in an Optional<T*>. Instead of trying to mimic the behavior of
Expected and try to make ExpectedHolder work with references and
non-references, I simply store the reference to the Expected object in
the holder.
I also add a bunch of tests for these matchers, which have helped me
flesh out some problems in my initial implementation of this patch, and
uncovered the fact that we are not consistent in quoting our values in
the matcher output (which I also fix).
Alex Bradbury [Thu, 7 Dec 2017 10:46:23 +0000 (10:46 +0000)]
[RISCV] MC layer support for the standard RV32D instruction set extension
As the FPR32 and FPR64 registers have the same names, use
validateTargetOperandClass in RISCVAsmParser to coerce a parsed FPR32 to an
FPR64 when necessary. The rest of this patch is very similar to the RV32F
patch.
[CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register.
Work towards the unification of MIR and debug output by refactoring the
interfaces.
For MachineOperand::print, keep a simple version that can be easily called
from `dump()`, and a more complex one which will be called from both the
MIRPrinter and MachineInstr::print.
Add extra checks inside MachineOperand for detached operands (operands
with getParent() == nullptr).
Alex Bradbury [Thu, 7 Dec 2017 10:26:05 +0000 (10:26 +0000)]
[RISCV] MC layer support for the standard RV32F instruction set extension
The most interesting part of this patch is probably the handling of
rounding mode arguments. Sadly, the RISC-V assembler handles floating point
rounding modes as a special "argument" when it would be more consistent to
handle them like the atomics, opcode suffixes. This patch supports parsing
this optional parameter, using InstAlias to allow parsing these floating point
instructions when no rounding mode is specified.
Alex Bradbury [Thu, 7 Dec 2017 09:51:55 +0000 (09:51 +0000)]
[TableGen] Give the option of tolerating duplicate register names
A number of architectures re-use the same register names (e.g. for both 32-bit
FPRs and 64-bit FPRs). They are currently unable to use the tablegen'erated
MatchRegisterName and MatchRegisterAltName, as tablegen (when built with
asserts enabled) will fail.
When the AllowDuplicateRegisterNames in AsmParser is set, duplicated register
names will be tolerated. A backend can then coerce registers to the desired
register class by (for instance) implementing validateTargetOperandClass.
At least the in-tree Sparc backend could benefit from this, as does RISC-V
(single and double precision floating point registers).
Gadi Haber [Thu, 7 Dec 2017 09:16:34 +0000 (09:16 +0000)]
[X86][FMA][FMA4]: Adding full coverage of MC encoding for the FMA, FMA4 isa sets.<NFC>
NFC.
Adding MC regressions tests to cover the FMA and FMA4 ISA sets.
This patch is part of a larger task to cover MC encoding of all X86 ISA Sets starting revision https://reviews.llvm.org/D39952
Gadi Haber [Thu, 7 Dec 2017 09:00:19 +0000 (09:00 +0000)]
[X86][X87]: Adding full coverage of MC encoding for all X87 ISA Sets.<NFC>
NFC.
Currently, not all the X86 ISA Sets are covered by the MC regressions tests for X86.
A full coverage needs to be added for each ISA set and for both 32bit and 64bit instructions + registers.
This patch includes MC assembly tests for the X87 32bit and 64bit.
Sam Clegg [Thu, 7 Dec 2017 02:55:51 +0000 (02:55 +0000)]
[WebAssembly] section kind can be code
Currently, when creating a named section, the Wasm
frontend forces it to use `SectionKind::Data`, whereas
in fact C++ does generate code sections with custom
names.
Dan Gohman [Wed, 6 Dec 2017 23:57:11 +0000 (23:57 +0000)]
[WebAssembly] Import the linear memory and function table.
Instead of having .o files contain linear-memory and function table
definitions, use imports. This is more consistent with the stack pointer
being imported, and it's consistent with the linker being the one to
decide whether linear memory and function table are imported or defined
in the linked output. This implements tool-conventions #23.
Florian Hahn [Wed, 6 Dec 2017 22:48:36 +0000 (22:48 +0000)]
[AArch64] Add patterns to replace fsub fmul with fma fneg.
Summary:
This patch adds MachineCombiner patterns for transforming
(fsub (fmul x y) z) into (fma x y (fneg z)). This has a lower
latency on micro architectures where fneg is cheap.
Matthew Simpson [Wed, 6 Dec 2017 21:22:54 +0000 (21:22 +0000)]
[PGO] Make indirect call promotion a utility
This patch factors out the main code transformation utilities in the pgo-driven
indirect call promotion pass and places them in Transforms/Utils. The change is
intended to be a non-functional change, letting non-pgo-driven passes share a
common implementation with the existing pgo-driven pass.
The common utilities are used to conditionally promote indirect call sites to
direct call sites. They perform the underlying transformation, and do not
consider profile information. The pgo-specific details (e.g., the computation
of branch weight metadata) have been left in the indirect call promotion pass.
Florian Hahn [Wed, 6 Dec 2017 20:27:33 +0000 (20:27 +0000)]
[MachineCombiner] Add up latencies of all instructions in new pattern.
Summary:
When calculating the RootLatency, we add up all the latencies of the
deleted instructions. But for NewRootLatency we only add the latency of
the new root instructions, ignoring the latencies of the other
instructions inserted. This leads the combiner to underestimate the cost
of patterns which add multiple instructions. This patch fixes that by
summing up the latencies of all new instructions. For NewRootNode, the
more complex getLatency function is used.
Note that we may be slightly more precise than just summing up
all latencies. For example, consider a pattern like
r1 = INS1 ..
r2 = INS2 ..
r3 = INS3 r1, r2
I think in some other places, the total latency of the pattern would be
estimated as lat(INS3) + max(lat(INS1), lat(INS2)). If you consider
that worth changing, I think it would be best to do in a follow-up
patch.
Alina Sbirlea [Wed, 6 Dec 2017 19:56:37 +0000 (19:56 +0000)]
[ModRefInfo] Do not use ModRefInfo result in if conditions as this makes
assumptions about the values in the enum. Replace with wrapper returning
bool [NFC].
Rui Ueyama [Wed, 6 Dec 2017 19:18:24 +0000 (19:18 +0000)]
[COFF] Ignore semicolons in module definition identifiers
Patch by David Major.
The NSS project's .def files make heavy use of semicolons in a
frightening attempt at portability:
https://hg.mozilla.org/projects/nss/raw-file/tip/lib/ckfw/capi/nsscapi.def
lld-link was treating the semicolon as part of the export name,
resulting in unresolved symbols. This patch includes ';' in the list of
characters to split on.
Craig Topper [Wed, 6 Dec 2017 18:40:46 +0000 (18:40 +0000)]
[X86] Simplify the TTI code for getInterleavedMemoryOpCost around for AVX512BW. NFCI
Previously the lambda for AVX512 passed out a flag that indicated whether AVX512BW was required and that was checked against the AVX512BW subtarget flag outside.
This patch changes the interface to pass the AVX512BW subtarget bit in and return its value if we detect 16 or 8 bit types.
Shoaib Meenai [Wed, 6 Dec 2017 18:33:07 +0000 (18:33 +0000)]
[cmake] Remove unnecessary header include in atomics check
The header include was required to work around PR19898, as noted in that
comment. That PR has since been marked resolved fixed, and the
configuration check passes without the header inclusion both when
compiling on Windows with cl and when cross-compiling on Linux using
clang-cl.
I noticed this because the inclusion was cased incorrectly (Intrin.h
instead of intrin.h), which when cross-compiling on a case sensitive
file system would cause the intrin.h from the Windows SDK to be included
(which LLVM can't handle) instead of the one from clang's resource
directory, making the check fail. This is the same issue as r309980.
Correcting the case of the inclusion makes the check pass when cross
compiling, but it seems better to get rid of the inclusion entirely,
since it appears to be unnecessary now.
Adam Nemet [Wed, 6 Dec 2017 16:50:50 +0000 (16:50 +0000)]
[opt-viewer] Suppress noisy Swift remarks
Most likely, this is not how we want to handle this in the long term. This
code should probably be in the Swift repo and somehow plugged into the
opt-viewer. This is still however very experimental at this point so I don't
want to over-engineer it at this point.
Nirav Dave [Wed, 6 Dec 2017 15:30:13 +0000 (15:30 +0000)]
[ARM][AArch64][DAG] Reenable post-legalize store merge
Reenable post-legalize stores with constant merging computation and
corresponding test case.
* Properly truncate store merge constants
* Disable merging of truncated stores floating points
* Ensure merges of constant stores into a single vector are
constructed from legal elements.