[PATCH] D140208: [AMDGPU] Improved wide multiplies

Mon Jan 9 04:53:01 PST 2023

tsymalla added a comment.

A few nits / questions, otherwise LGTM.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:2993
             unsigned j1 = DstIndex - j0;
+            bool AtLeastOneArgIsZero =
+                Src0KB[j0].isZero() || Src1KB[j1].isZero();
----------------
It looks like this is doing the same thing like in LOC 3040. Does it make sense to cache the results (is there anything computed when invoking `isZero()`)? You could also cache the pairs that are known to be zero, but this is just optional. I'd just think about re-implementing the same thing a few lines later.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:3180
+  const bool AtLeastOneSrcIsZero =
+      Helper.getKnownBits()->getKnownBits(Src0).isZero() ||
+      Helper.getKnownBits()->getKnownBits(Src1).isZero();
----------------
I'd still prefer to have the getter method `Helper::getKnownBits()` differently named. The one thing is returning a pointer to the known bits analysis, the other one is returning the known bits for a register.

================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/mul-known-bits.i64.ll:396
+;
+; GFX11-LABEL: v_mul_i64_partially_masked_src0:
+; GFX11:       ; %bb.0:
----------------
Can you please pre-commit these tests?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140208/new/

https://reviews.llvm.org/D140208