[PATCH] D134433: [AMDGPU][GISel] Enable Matching of V2S16 G_BUILD_VECTOR
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 23 05:48:42 PDT 2022
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCombine.td:91-95
+def const_build_vector_fold : GICombineRule<
+ (defs root:$buildvector, uint64_matchinfo:$matchinfo),
+ (match (wip_match_opcode G_BUILD_VECTOR, G_BUILD_VECTOR_TRUNC):$buildvector,
+ [{ return RegBankHelper.matchConstBuildVectorFold(*${buildvector}, ${matchinfo}); }]),
+ (apply [{ RegBankHelper.applyConstBuildVectorFold(*${buildvector}, ${matchinfo}); }])>;
----------------
This combine can be a separate patch
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:729
- .moreElementsIf(isSmallOddVector(0), oneMoreElement(0))
- .clampMaxNumElements(0, S16, 2)
- .clampScalar(0, S16, S64)
----------------
This shouldn't change
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:1448-1465
+ .bitcastIf(all(sizeIsMultipleOf32(VecTypeIdx),
+ scalarOrEltWiderThan(VecTypeIdx, 64)),
+ [=](const LegalityQuery &Query) {
+ // For > 64-bit element types, try to turn this into a
+ // 64-bit element vector since we may be able to do better
+ // indexing if this is scalar. If not, fall back to 32.
+ const LLT EltTy = Query.Types[EltTypeIdx];
----------------
This mostly looks like unwelcome clang-format changes. I think the only thing that really changed here was the S32->S16?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp:341-342
+ Register Dst = MI.getOperand(0).getReg();
+ if (MRI.getType(Dst) != LLT::fixed_vector(2, 16))
+ return false;
+
----------------
In the future we may want to handle <2 x i32> and <4 x i16>, at least for gfx940
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp:368-369
+ Register Dst = MI.getOperand(0).getReg();
+ const RegisterBank *DstRB = MRI.getRegBankOrNull(Dst);
+ assert(DstRB);
+
----------------
Probably should have an asserting getRegBank to go along with getRegClass instead of littering asserts like this
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp:373
+ MRI.setRegBank(TmpReg, *DstRB);
+ B.buildConstant(TmpReg, Value);
+ B.buildBitcast(Dst, TmpReg);
----------------
You can do auto Const = B.buildConstant(S32,...); MRI.setRegBank(Const.getReg(0))
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D134433/new/
https://reviews.llvm.org/D134433
More information about the llvm-commits
mailing list