[llvm] [AMDGPU] Optimizing Dynamic Alloca I-Sel (PR #124292)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Sun Jan 26 22:40:01 PST 2025
================
@@ -4088,15 +4088,12 @@ SDValue SITargetLowering::LowerDYNAMIC_STACKALLOC(SDValue Op,
DAG.getTargetConstant(Intrinsic::amdgcn_wave_reduce_umax, dl, MVT::i32);
Size = DAG.getNode(ISD::INTRINSIC_WO_CHAIN, dl, MVT::i32, WaveReduction,
Size, DAG.getConstant(0, dl, MVT::i32));
- SDValue ScaledSize = DAG.getNode(
- ISD::SHL, dl, VT, Size,
+ SDNode *ScaledSize = DAG.getMachineNode(
+ AMDGPU::S_LSHL_B32, dl, VT, Size,
DAG.getConstant(Subtarget->getWavefrontSizeLog2(), dl, MVT::i32));
- NewSP =
- DAG.getNode(ISD::ADD, dl, VT, BaseAddr, ScaledSize); // Value in vgpr.
- SDValue ReadFirstLaneID =
- DAG.getTargetConstant(Intrinsic::amdgcn_readfirstlane, dl, MVT::i32);
- NewSP = DAG.getNode(ISD::INTRINSIC_WO_CHAIN, dl, MVT::i32, ReadFirstLaneID,
- NewSP);
+ NewSP = {DAG.getMachineNode(AMDGPU::S_ADD_I32, dl, VT, BaseAddr,
----------------
arsenm wrote:
That is missing the divergent predicate it should have (e.g. see uses of DivergentBinFrag). Can you open a separate PR to fix that?
https://github.com/llvm/llvm-project/pull/124292