[AMDGPU][CodeGen] To improve CGEMM performance: combine LDS reads.
author Alexander Timofeev <Alexander.Timofeev@amd.com>
Thu, 3 Nov 2016 14:37:13 +0000 (14:37 +0000)
committer Alexander Timofeev <Alexander.Timofeev@amd.com>
Thu, 3 Nov 2016 14:37:13 +0000 (14:37 +0000)
commit d767830f39e89fbd53366b4c63bd1715e0f365b2
tree ae5dd118e7f76602b173e24c63590293ee39bb65
parent d55aa1c80c2bb6099fc25005e465c3fd6997e4ea

This change exploits the fact that LDS reads may be reordered even if
they access the same location.

Prior to this change, the algorithm stopped as soon as it encountered
any memory access between the loads that were expected to be merged
together, even though a Read-After-Read conflict cannot affect execution
correctness.
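
The difference between the old and new search can be illustrated with a
minimal sketch. This is not the code from SILoadStoreOptimizer.cpp: the
Inst type, the Kind enum, and findPairableRead are hypothetical names made
up for illustration, and the pairing test (same base, different offset) is
a deliberate simplification of the real offset checks.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical, simplified instruction model; NOT the LLVM MachineInstr API.
enum class Kind { LdsRead, LdsWrite, Other };

struct Inst {
  Kind kind;
  int base;    // symbolic id of the base address register
  int offset;  // element offset from that base
};

// Looks for a later LDS read that could be paired with insts[first].
// allowReadsBetween == false models the old behaviour: any memory access
// that cannot be paired ends the search.
// allowReadsBetween == true models the new behaviour: intervening reads are
// skipped (read-after-read is harmless), while writes still end the search.
std::optional<std::size_t> findPairableRead(const std::vector<Inst>& insts,
                                            std::size_t first,
                                            bool allowReadsBetween) {
  if (first >= insts.size() || insts[first].kind != Kind::LdsRead)
    return std::nullopt;

  for (std::size_t i = first + 1; i < insts.size(); ++i) {
    const Inst& cand = insts[i];
    if (cand.kind == Kind::Other)
      continue;  // non-memory instruction: never blocks in this model
    if (cand.kind == Kind::LdsWrite)
      return std::nullopt;  // a write may conflict with the reads: stop
    // cand is an LDS read. Assumed pairing rule: same base, distinct offset.
    if (cand.base == insts[first].base && cand.offset != insts[first].offset)
      return i;
    if (!allowReadsBetween)
      return std::nullopt;  // old behaviour: give up on an unpairable read
  }
  return std::nullopt;
}
```

In this model, a sequence of reads from base 0, base 1, base 0 yields no
pair under the old rule but pairs the first and third reads under the new
one. A write between the reads blocks merging in both cases.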

Improves the performance of manually loop-unrolled hcBLAS CGEMM kernels
by 44%. An improvement is also expected for any long sequence of reads
from LDS.

Differential Revision: https://reviews.llvm.org/D25944

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285919 91177308-0d34-0410-b5e6-96231b3b80d8
lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
test/CodeGen/AMDGPU/ds_read2.ll