[llvm] [NVPTX] support packed f32 instructions for sm_100+ (PR #126337)
Alex MacLean via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 10 09:30:41 PDT 2025
================
@@ -3595,6 +3635,9 @@ NVPTXTargetLowering::LowerReturn(SDValue Chain, CallingConv::ID CallConv,
const auto VectorInfo = VectorizePTXValueVTs(VTs, Offsets, RetAlign);
unsigned I = 0;
for (const unsigned NumElts : VectorInfo) {
+ // amount to subdivide. If not v2f32, we don't consider packing
+ const unsigned PackingAmt = VTs[I] == MVT::v2f32 ? 2 : 1;
----------------
AlexMaclean wrote:
Why only do this for v2f32?
https://github.com/llvm/llvm-project/pull/126337