[PATCH] D157012: AMDGPU/GlobalISel: insert readfirstlane on SGPR inlineasm copy

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 3 10:32:30 PDT 2023


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInlineAsmLowering.cpp:29-30
+
+bool AMDGPUInlineAsmLowering::buildAnyextOrCopy(
+    Register Dst, Register Src, MachineIRBuilder &MIRBuilder) const {
+
----------------
Seems like a place that accidentally works


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInlineAsmLowering.cpp:38
+  auto SrcTy = MRI.getType(Src);
+  if (!SrcTy.isValid()) {
+    LLVM_DEBUG(dbgs() << "Source type for copy is not valid\n");
----------------
This should only happen for physical registers which you also need to handle


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInlineAsmLowering.cpp:66
+                      .buildIntrinsic(Intrinsic::amdgcn_readfirstlane,
+                                      {MRI.getType(Src)})
+                      .addReg(Src);
----------------
Already have SrcTy above. Also this won't work for any tuple type. You need to split up any larger registers 


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-inline-asm.ll:299
+entry:
+  %0 = load i32, ptr addrspace(1) %in, align 4
+  %1 = call { i64, i64 } asm sideeffect "v_mad_u64_u32 $0, $1, $2, $3, $4", "=v,=s,r,v,v"(i32 %0, i32 poison, i64 poison) #13
----------------
Don't use anonymous values


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-inline-asm.ll:300
+  %0 = load i32, ptr addrspace(1) %in, align 4
+  %1 = call { i64, i64 } asm sideeffect "v_mad_u64_u32 $0, $1, $2, $3, $4", "=v,=s,r,v,v"(i32 %0, i32 poison, i64 poison) #13
+  ret void
----------------
Better to have real input values


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-inline-asm.ll:303
+}
+
 define void @test_many_matching_constraints(i32 %a, i32 %b, i32 %c) nounwind {
----------------
This needs a full range of tested types. You need at least i16, half, <2 x i16>, <2 x half>, pointers of various address spaces, and vectors of all of those


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157012/new/

https://reviews.llvm.org/D157012



More information about the llvm-commits mailing list