[PATCH] D83862: [AMDGPU] Add missing test prefixes
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jul 17 09:17:46 PDT 2020
rampitec added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/perfhint.ll:33
; GCN-LABEL: {{^}}test_large_stride:
-; MemoryBound: 0
-; WaveLimiterHint : 1
+; GCN: MemoryBound: 0
+; GCN: WaveLimiterHint : 1
----------------
foad wrote:
> rampitec wrote:
> > foad wrote:
> > > This check fails.
> > This one is memory bound; there are practically only memory operations here. I think it needs some ALU instructions in between so that the test catches only the large stride, as intended.
> OK, fixed in f05bce86af32d7b5cf1ab28b3abf6ee473bf3ef1.
Thank you!
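For context on why the unprefixed lines in the hunk above never checked anything: FileCheck only acts on lines containing a directive built from an active check prefix, and silently ignores everything else. The sketch below is a simplified toy model of that selection step, not FileCheck's actual implementation. The function name `extract_directives`, the reduced suffix list, and the `BEFORE`/`AFTER` strings are illustrative assumptions.

```python
import re

def extract_directives(test_text, prefixes):
    """Return (prefix, suffix, pattern) for each line a FileCheck-like tool would act on.

    Toy model: only a handful of directive suffixes are recognised here.
    """
    alternation = "|".join(re.escape(p) for p in prefixes)
    directive = re.compile(
        r"(?<![\w-])(" + alternation + r")(-LABEL|-NEXT|-NOT|-DAG|-SAME)?:\s*(.*)$"
    )
    found = []
    for line in test_text.splitlines():
        match = directive.search(line)
        if match:
            found.append((match.group(1), match.group(2) or "", match.group(3).strip()))
    return found

# The check lines before and after this patch (copied from the hunk above).
BEFORE = """\
; GCN-LABEL: {{^}}test_large_stride:
; MemoryBound: 0
; WaveLimiterHint : 1
"""

AFTER = """\
; GCN-LABEL: {{^}}test_large_stride:
; GCN: MemoryBound: 0
; GCN: WaveLimiterHint : 1
"""

if __name__ == "__main__":
    print(extract_directives(BEFORE, ["GCN"]))
    print(extract_directives(AFTER, ["GCN"]))
```

With the `GCN` prefix active, only the `GCN-LABEL` line of `BEFORE` is picked up, so the two hint lines were plain comments as far as the test was concerned.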
================
Comment at: llvm/test/CodeGen/AMDGPU/perfhint.ll:87
+; GCN: MemoryBound: 0
+; GCN: WaveLimiterHint : 0
define amdgpu_kernel void @test_indirect_through_phi(float addrspace(1)* %arg) {
----------------
foad wrote:
> rampitec wrote:
> > foad wrote:
> > > This check fails. Perhaps D47740 never worked?
> > Looks like it did not :(
> >
> > Anyway, this case is not memory bound even though it is indirect, because we have a single load followed by multiple stores. That was the point of the check.
> The problem is that after AMDGPULowerKernelArguments, the load from %arg looks like this:
> ```
> %arg.load = load float addrspace(1)*, float addrspace(1)* addrspace(4)* %arg.kernarg.offset.cast, align 4, !invariant.load !0
> %load = load float, float addrspace(1)* %arg.load, align 8
> ```
> which is indirect. Any ideas?
A-ha! The representation changed, but we did not catch it because the test was broken.
The first load is from the constant address space, so it is uniform. I suppose we can ignore the constant address space for this purpose, since it creates much less memory traffic.
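To make the idea concrete, here is a rough toy model of it, not the actual AMDGPUPerfHintAnalysis code. It walks the address computation of a load and reports it as indirect if the address depends on another load, optionally skipping loads from addrspace(4). The `Value` class, `is_indirect_access`, and the `ignore_constant` flag are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import List

CONSTANT_ADDRESS_SPACE = 4  # addrspace(4), as in the kernarg load quoted above

@dataclass
class Value:
    """Hypothetical stand-in for an IR value; not an LLVM class."""
    name: str
    opcode: str                          # e.g. "load", "cast", "arg"
    operands: List["Value"] = field(default_factory=list)
    pointer_addrspace: int = 0           # for loads: address space being read

def is_indirect_access(load, ignore_constant=False):
    """True if the address of `load` is derived from another load."""
    assert load.opcode == "load"
    worklist = [load.operands[0]]        # start from the pointer operand
    visited = set()
    while worklist:
        value = worklist.pop()
        if id(value) in visited:
            continue
        visited.add(id(value))
        if value.opcode == "load":
            # Assumed fix: a load from constant memory is uniform, so skip it.
            if ignore_constant and value.pointer_addrspace == CONSTANT_ADDRESS_SPACE:
                continue
            return True
        worklist.extend(value.operands)
    return False

# The two loads produced after AMDGPULowerKernelArguments, as quoted above.
kernarg_ptr = Value("%arg.kernarg.offset.cast", "cast", [Value("kernarg", "arg")])
arg_load = Value("%arg.load", "load", [kernarg_ptr], CONSTANT_ADDRESS_SPACE)
data_load = Value("%load", "load", [arg_load], 1)

# A load whose address comes from global memory (hypothetical counter-example).
global_ptr_load = Value("%p", "load", [Value("%q", "arg")], 1)
chained_load = Value("%v", "load", [global_ptr_load], 1)

if __name__ == "__main__":
    print(is_indirect_access(data_load))
    print(is_indirect_access(data_load, ignore_constant=True))
    print(is_indirect_access(chained_load, ignore_constant=True))
```

In this model `%load` counts as indirect today, and stops counting once constant-address-space loads are skipped. A load whose pointer comes from global memory is still treated as indirect.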
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D83862/new/
https://reviews.llvm.org/D83862