[PATCH] D140208: [AMDGPU] Improved wide multiplies
Thomas Symalla via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 9 04:53:01 PST 2023
tsymalla added a comment.
A few nits / questions, otherwise LGTM.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:2993
unsigned j1 = DstIndex - j0;
+ bool AtLeastOneArgIsZero =
+ Src0KB[j0].isZero() || Src1KB[j1].isZero();
----------------
It looks like this is doing the same thing like in LOC 3040. Does it make sense to cache the results (is there anything computed when invoking `isZero()`)? You could also cache the pairs that are known to be zero, but this is just optional. I'd just think about re-implementing the same thing a few lines later.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:3180
+ const bool AtLeastOneSrcIsZero =
+ Helper.getKnownBits()->getKnownBits(Src0).isZero() ||
+ Helper.getKnownBits()->getKnownBits(Src1).isZero();
----------------
I'd still prefer to have the getter method `Helper::getKnownBits()` differently named. The one thing is returning a pointer to the known bits analysis, the other one is returning the known bits for a register.
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/mul-known-bits.i64.ll:396
+;
+; GFX11-LABEL: v_mul_i64_partially_masked_src0:
+; GFX11: ; %bb.0:
----------------
Can you please pre-commit these tests?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D140208/new/
https://reviews.llvm.org/D140208
More information about the llvm-commits
mailing list