From: Alexey Bataev
Date: Tue, 4 Dec 2018 15:25:01 +0000 (+0000)
Subject: [OPENMP][NVPTX]Fixed emission of the critical region.
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=f62408a8ff7654fc0b72e192edb8529e0d07787d;p=clang

[OPENMP][NVPTX]Fixed emission of the critical region.

Critical regions are constructs that, generally speaking, are not
supported by the NVPTX target. Instead, we use a special technique to
handle them. Currently they are supported only within the loop, and all
the threads in the loop must execute the same critical region. Inside
these special regions, the body must still be emitted as critical to
avoid possible data races between the teams, and synchronization must
use the __kmpc_barrier functions.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@348272 91177308-0d34-0410-b5e6-96231b3b80d8
---

diff --git a/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp b/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
index cf814a4b20..5f7122e4e8 100644
--- a/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
+++ b/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
@@ -2743,14 +2743,16 @@ void CGOpenMPRuntimeNVPTX::emitCriticalRegion(
   CGF.EmitBlock(BodyBB);
 
   // Output the critical statement.
-  CriticalOpGen(CGF);
+  CGOpenMPRuntime::emitCriticalRegion(CGF, CriticalName, CriticalOpGen, Loc,
+                                      Hint);
 
   // After the body surrounded by the critical region, the single executing
   // thread will jump to the synchronisation point.
   // Block waits for all threads in current team to finish then increments the
   // counter variable and returns to the loop.
   CGF.EmitBlock(SyncBB);
-  getNVPTXCTABarrier(CGF);
+  emitBarrierCall(CGF, Loc, OMPD_unknown, /*EmitChecks=*/false,
+                  /*ForceSimpleCall=*/true);
 
   llvm::Value *IncCounterVal =
       CGF.Builder.CreateNSWAdd(CounterVal, CGF.Builder.getInt32(1));
diff --git a/test/OpenMP/nvptx_parallel_codegen.cpp b/test/OpenMP/nvptx_parallel_codegen.cpp
index 08431fccc0..3dcf330179 100644
--- a/test/OpenMP/nvptx_parallel_codegen.cpp
+++ b/test/OpenMP/nvptx_parallel_codegen.cpp
@@ -356,7 +356,13 @@ int bar(int n){
 // CHECK: [[RES:%.+]] = icmp eq i32 [[TID]], [[CC_VAL]]
 // CHECK: br i1 [[RES]], label
 
-// CHECK: call void @llvm.nvvm.barrier0()
+// CHECK: call void @__kmpc_critical(
+// CHECK: load i32, i32*
+// CHECK: add nsw i32
+// CHECK: store i32
+// CHECK: call void @__kmpc_end_critical(
+
+// CHECK: call void @__kmpc_barrier(%struct.ident_t* @{{.+}}, i32 %{{.+}})
 // CHECK: [[NEW_CC_VAL:%.+]] = add nsw i32 [[CC_VAL]], 1
 // CHECK: store i32 [[NEW_CC_VAL]], i32* [[CC]],
 // CHECK: br label
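
Note: the control flow that the patched emitCriticalRegion produces can be
modelled on the host as the rough sketch below. It is not Clang or runtime
code. The function name simulateCriticalLoop, the sequential inner loop
standing in for the team's threads, and the loop bound (the team width) are
assumptions made for illustration; only the turn test, the critical body,
the barrier and the counter increment come from the diff and its CHECK lines.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side model of the serialized critical-region loop.
// In each round, the thread whose ID equals the counter runs the body,
// then the whole team synchronizes and the counter advances by one.
// Returns the order in which the modelled threads entered the body.
std::vector<int> simulateCriticalLoop(int teamWidth, int &shared) {
  std::vector<int> order;
  // Assumption: the loop runs once per thread in the team.
  for (int counter = 0; counter < teamWidth; ++counter) {
    // The inner loop stands in for the threads of one team.
    for (int tid = 0; tid < teamWidth; ++tid) {
      if (tid == counter) {   // icmp eq i32 TID, CC_VAL
        // __kmpc_critical(...) would be called here.
        shared += 1;          // load i32, add nsw i32, store i32
        // __kmpc_end_critical(...) would be called here.
        order.push_back(tid);
      }
      // Threads that fail the test go straight to the sync block.
    }
    // Reaching the end of the inner loop models __kmpc_barrier(...):
    // no thread starts the next round before all have finished this one.
  }
  return order;
}
```

With a team width of 4 and a shared value starting at 0, the model runs the
body once per thread, in thread-ID order, and leaves the shared value at 4.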