Summary: When device offloading is enabled and the device is an NVIDIA GPU, OpenMP target regions must be compiled with relocation enabled by passing the "-c" flag to the PTXAS invocation.
Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, Hahnfeld, jlebar, hfinkel, tstellar
Reviewed By: Hahnfeld
Subscribers: Hahnfeld, rengolin, mkuron, cfe-commits
Differential Revision: https://reviews.llvm.org/D29642
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@310300
91177308-0d34-0410-b5e6-
96231b3b80d8
for (const auto& A : Args.getAllArgValues(options::OPT_Xcuda_ptxas))
CmdArgs.push_back(Args.MakeArgString(A));
+ // In OpenMP we need to generate relocatable code.
+ if (JA.isOffloading(Action::OFK_OpenMP))
+ CmdArgs.push_back("-c");
+
const char *Exec;
if (Arg *A = Args.getLastArg(options::OPT_ptxas_path_EQ))
Exec = A->getValue();
// CHK-TWOCUBIN: clang-offload-bundler" "-type=o" "{{.*}}inputs={{.*}}tmp1.o" "-outputs={{.*}}.o,{{.*}}tmp1-openmp-nvptx64-nvidia-cuda.cubin" "-unbundle"
// CHK-TWOCUBIN-NEXT: clang-offload-bundler" "-type=o" "{{.*}}inputs={{.*}}tmp2.o" "-outputs={{.*}}.o,{{.*}}tmp2-openmp-nvptx64-nvidia-cuda.cubin" "-unbundle"
// CHK-TWOCUBIN-NEXT: nvlink" "-o" "{{.*}}-openmp-nvptx64-nvidia-cuda" {{.*}} "openmp-offload.c.tmp1-openmp-nvptx64-nvidia-cuda.cubin" "openmp-offload.c.tmp2-openmp-nvptx64-nvidia-cuda.cubin"
+
+/// ###########################################################################
+
+/// Check PTXAS is passed -c flag when offloading to an NVIDIA device using OpenMP.
+// RUN: %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -no-canonical-prefixes %s 2>&1 \
+// RUN: | FileCheck -check-prefix=CHK-PTXAS-DEFAULT %s
+
+// CHK-PTXAS-DEFAULT: ptxas{{.*}}" "-c"