[PATCH] D123835: AMDGPU/SDAG: Refine the fold to v_mad_[iu]64_[iu]32
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 26 06:25:16 PDT 2022
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:10748
+ SDValue Shift = DAG.getShiftAmountConstant(32, MVT::i64, SL);
+ SDValue AccumLo = DAG.getNode(ISD::TRUNCATE, SL, MVT::i32, Accum);
+ SDValue AccumHi = DAG.getNode(ISD::SRL, SL, MVT::i64, Accum, Shift);
----------------
foad wrote:
> I don't know if it makes any practical difference, but other code like `AMDGPUTargetLowering::LowerUDIVREM64` uses EXTRACT_ELEMENT to split an i64 into a pair of i32s, and BITCAST(BUILD_VECTOR ...) to reassemble them.
Using the shift adds extra steps. The combine on 64-bit shifts will turn this into the vector build anyway.
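
To illustrate the two forms being compared, here is a minimal sketch in plain C++ rather than SelectionDAG code. It models the arithmetic only. `Halves`, `splitWithShift`, `splitWithExtract`, `extractElement` and `rebuild` are hypothetical names for this example, not LLVM APIs, and it assumes the shifted high half is truncated to 32 bits afterwards.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a v2i32 value: element 0 is the low half.
struct Halves {
  uint32_t Lo;
  uint32_t Hi;
};

// Models the pattern in the diff: TRUNCATE gives the low half, and a
// logical shift right by 32 (SRL) exposes the high half.
Halves splitWithShift(uint64_t Accum) {
  uint32_t Lo = static_cast<uint32_t>(Accum);
  uint64_t Shifted = Accum >> 32;
  return {Lo, static_cast<uint32_t>(Shifted)};
}

// Models EXTRACT_ELEMENT on an i64: index 0 selects the low i32,
// index 1 selects the high i32.
uint32_t extractElement(uint64_t Value, unsigned Index) {
  assert(Index < 2 && "an i64 has only two i32 halves");
  return static_cast<uint32_t>(Index == 0 ? Value : Value >> 32);
}

Halves splitWithExtract(uint64_t Accum) {
  return {extractElement(Accum, 0), extractElement(Accum, 1)};
}

// Models BITCAST(BUILD_VECTOR Lo, Hi) reassembling the i64.
uint64_t rebuild(Halves H) {
  return (static_cast<uint64_t>(H.Hi) << 32) | H.Lo;
}
```

Both splits produce the same pair of halves. The difference under discussion is only which nodes are created up front and how many combines have to run before the DAG reaches the same shape.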
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D123835/new/
https://reviews.llvm.org/D123835