[CUDA] Implemented _[bi]mma* builtins.
author Artem Belevich <tra@google.com>
Thu, 25 Apr 2019 22:28:09 +0000 (22:28 +0000)
committer Artem Belevich <tra@google.com>
Thu, 25 Apr 2019 22:28:09 +0000 (22:28 +0000)
These builtins provide access to the new integer and
sub-integer variants of MMA (matrix multiply-accumulate) instructions
provided by CUDA-10.x on sm_75 (AKA Turing) GPUs.

Also added a feature for PTX 6.4. While Clang/LLVM does not generate
any PTX instructions that need it, we still need to pass it through to
ptxas in order to be able to compile code that uses the new 'mma'
instruction as inline assembly (e.g. as used by NVIDIA's CUTLASS library:
https://github.com/NVIDIA/cutlass/blob/master/cutlass/arch/mma.h#L101).

Differential Revision: https://reviews.llvm.org/D60279

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359248 91177308-0d34-0410-b5e6-96231b3b80d8
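
The PTX64 feature in the diff stores the version as the string "64" while describing it as "PTX version 6.4". A minimal sketch of that encoding follows, assuming the integer is major * 10 + minor and that it ends up in a ".version" directive handed to ptxas. The helper name ptxVersionDirective is hypothetical and is not part of LLVM.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not an LLVM API.
// Assumption: a SubtargetFeature such as PTX64 sets PTXVersion to 64,
// which encodes major * 10 + minor (6.4).
std::string ptxVersionDirective(unsigned ptxVersion) {
  // Reject values that cannot hold a major version digit.
  assert(ptxVersion >= 10 && "expected major * 10 + minor");
  // Split the integer back into its major and minor parts.
  return ".version " + std::to_string(ptxVersion / 10) + "." +
         std::to_string(ptxVersion % 10);
}
```

Under this assumption, enabling ptx64 would correspond to ptxVersionDirective(64), which is what lets ptxas accept inline assembly that needs PTX 6.4 even though the compiler itself emits no such instructions.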

lib/Target/NVPTX/NVPTX.td

index 9f5048b3182ce3c6ef9e34429b8c22bb6f8715b6..1d947ef1ce6231c46c5622d42f8651378dbf1d38 100644
@@ -75,6 +75,8 @@ def PTX61 : SubtargetFeature<"ptx61", "PTXVersion", "61",
                              "Use PTX version 6.1">;
 def PTX63 : SubtargetFeature<"ptx63", "PTXVersion", "63",
                              "Use PTX version 6.3">;
+def PTX64 : SubtargetFeature<"ptx64", "PTXVersion", "64",
+                             "Use PTX version 6.4">;
 
 //===----------------------------------------------------------------------===//
 // NVPTX supported processors.