[PATCH] D110049: [AMDGPU] Correctly merge alias.scope and noalias metadata for memops

Brendon Cahoon via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 20 20:02:47 PDT 2021


bcahoon marked an inline comment as done.
bcahoon added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/lower-lds-struct-aa-memcpy.ll:20
+; CHECK-LABEL: @test
+; CHECK: store i8 3, i8 addrspace(3)* %0, align 4, !alias.scope !0, !noalias !3
+; CHECK: tail call void @llvm.memcpy.p3i8.p3i8.i64(i8 addrspace(3)* noundef align 1 dereferenceable(3) %2, i8 addrspace(3)* noundef align 1 dereferenceable(3) %1, i64 3, i1 false), !alias.scope !6, !noalias !7
----------------
arsenm wrote:
> rampitec wrote:
> > It is unclear from the test what are these !0, !3 etc values. Could you include it in the checks?
> I think update_test_checks support checking metadata nodes these days
I tried using update_test_checks, but no checks were added for the metadata. I'm not sure, but perhaps UpdateTestChecks doesn't support alias.scope and noalias metadata.  It looks like there is support for other metadata.


================
Comment at: llvm/test/CodeGen/AMDGPU/lower-lds-struct-aa-memcpy.ll:29
+  store i8 3, i8 addrspace(3)* getelementptr inbounds (%vec_type, %vec_type addrspace(3)* @_f1, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0), align 1
+  tail call void @llvm.memcpy.p3i8.p3i8.i64(i8 addrspace(3)* noundef align 1 dereferenceable(3) getelementptr inbounds (%vec_type, %vec_type addrspace(3)* @_f2, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0), i8 addrspace(3)* noundef align 1 dereferenceable(3) getelementptr inbounds (%vec_type, %vec_type addrspace(3)* @_f1, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0), i64 3, i1 false)
+  %0 = load i8, i8 addrspace(3)* getelementptr inbounds (%vec_type, %vec_type addrspace(3)* @_f2, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0), align 1
----------------
rampitec wrote:
> The original memcpy did not have any AA, we are merging AA produced by the same pass.
> 
> As I understand the problem is there is a single AA for the whole instruction, not for an operand. Clearly concat is not correct if we pass 2 different LDS pointers into a same call, but passing AA for only the first operand is not correct either, please disregard previous comment with respect to keeping the old AA value.
> 
> Using the intersect shall be fine, although on practice it will produce an empty list if pointers are to different fields. With that in mind the patch itself is correct, but please add metadata checks.
If there are more than two fields, then intersect wouldn't be empty.  For example, the noalias value for a load of f1 would include the scopes for f2 and f3. The noalias value for a load of f2 would includes the scopes for f1 and f3. Then, if there is a memcpy(f2, f1), then the noalias for the memcpy would be the scope for f3 (the intersection of {f1, f3} and {f2, f3}).

The main issue with existing alias.scope information is that the domain will always be different than the domain created for the LDS alias.scope metadata, which is an anonymous domain. So, the intersection of the LDS alias.scope domain with any previous domain will always be empty.


================
Comment at: llvm/test/CodeGen/AMDGPU/lower-lds-struct-aa-merge.ll:17
   %gep.a = getelementptr inbounds [64 x i32], [64 x i32] addrspace(3)* @a, i32 0, i32 %i
-  %val.a = load i32, i32 addrspace(3)* %gep.a, align 4, !alias.scope !0, !noalias !3, !tbaa !5
-  store i32 2, i32 addrspace(3)* getelementptr inbounds ([64 x i32], [64 x i32] addrspace(3)* @b, i32 0, i32 0), align 4, !alias.scope !3, !noalias !0, !tbaa !5
+  %val.a = load i32, i32 addrspace(3)* %gep.a, align 4,  !alias.scope !0, !noalias !3, !tbaa !5
+  store i32 2, i32 addrspace(3)* getelementptr inbounds ([64 x i32], [64 x i32] addrspace(3)* @b, i32 0, i32 0), align 4,  !alias.scope !3, !noalias !0, !tbaa !5
----------------
rampitec wrote:
> Can you remove these whitespace changes?
Thanks! That was unintentional.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110049/new/

https://reviews.llvm.org/D110049



More information about the llvm-commits mailing list