[PATCH] D155995: [AMDGPU]: Allow combining into v_dot4

Wed Oct 4 11:13:00 PDT 2023

jrbyrnes added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:12605
+    return IterIsSigned;
+  // If we have signed semantics with a MUL_U24 op, then fail
+  if (Src0.IsSigned.value_or(false) && MulOpcode == AMDGPUISD::MUL_U24)
----------------
bcahoon wrote:
> I think the check includes any extend (or unknown?) as well as signed?
> 
> Maybe say, If we have MUL_u24 without unsigned semantics, then fail.
> 
This check is checking for conflicting signedness semantics. In the case where we don't have signedness info from the ByteProvider, the signedness is irrelevant so we should say it doesn't conflict.

In the case where we don't have signedness information, then two things could have occurred:
	1. We are accumulating into 8 bit register, and have not done any extensions. In this case, the upper bits are irrelevant, and we may use either version of the dot.
	2. We have exclusively used any_extends. Same as case 1, the upper bits are irrelevant.

There is a third scenario which I have accounted for in the latest version. Previously, we would not have signedness info if we encountered an unhandled node (MemIntrinsic / AtomicSDNode). However, in this case, the upper bits may be relevant. Thus, instead of throwing away signedness info in this situations, we now fail.

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:12609
+  // If we have unsigned semantics with a MUL_I24 op, then fail
+  if (Src0.IsSigned && !*Src0.IsSigned && MulOpcode == AMDGPUISD::MUL_I24)
+    return IterIsSigned;
----------------
bcahoon wrote:
> BTW, is this the same as :
> if (!Src0.IsSigned.value_or(false) && MulOpcode == AMDGPUISD::MUL_I24)
Right nice catch -- except we should not fail if we don't have signedness info.

================
Comment at: llvm/test/CodeGen/AMDGPU/idot4s.ll:146
+; GFX11-DL-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX11-DL-NEXT:    v_dot4_i32_iu8 v0, v1, v0, s2 neg_lo:[1,1,0]
+; GFX11-DL-NEXT:    global_store_b32 v2, v0, s[0:1]
----------------
Answering offline question about "neg_lo:[1,1,0]"

These tests are introduced via this patch, and the modifier indicates we need signedness semantics for both operands. neg_lo:[1,1,0] was introduced originally by rebasing on top of changes from https://reviews.llvm.org/D158468?vs=552172&id=552466#toc I believe.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155995/new/

https://reviews.llvm.org/D155995