[PATCH] D154083: [AMDGPU] Rematerialize scalar loads
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 29 09:01:36 PDT 2023
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:108
bool SIInstrInfo::isReallyTriviallyReMaterializable(
const MachineInstr &MI) const {
----------------
piotr wrote:
> This is a pre-existing issue, but I am not very comfortable with the fact that we override `isReallyTriviallyReMaterializable` the way we do.
>
> It works because we rely on extra checks being performed elsewhere (for example, the call to `LiveRangeEdit::allUsesAvailableAt()`). However, the comment in the base class says `isReallyTriviallyReMaterializable` "must return false if (..) or if it requires any address registers that are not always available."
>
> Having said that, I am not sure there is a practical solution to this (apart from softening the comment in the base class).
Yes, I don't like this either. The way the rematerializable hooks are structured is confusing, and that is part of why this never got done before.
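For reference, the caller-side check we implicitly rely on looks roughly like this (paraphrased from LiveRangeEdit, names approximate, not verbatim):

  // LiveRangeEdit::canRematerializeAt() re-verifies that every register the
  // def reads still holds the same value at the rematerialization point,
  // which is what makes the loose target hook workable in practice:
  if (!allUsesAvailableAt(RM.OrigMI, DefIdx, UseIdx))
    return false;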
================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:111
+ if (isVOP1(MI) || isVOP2(MI) || isVOP3(MI) || isSDWA(MI) || isSALU(MI) ||
+ (isSMRD(MI) && MI.isDereferenceableInvariantLoad())) {
// Normally VALU use of exec would block the rematerialization, but that
----------------
piotr wrote:
> Ironically, adding the `isDereferenceableInvariantLoad` check means the examples that motivated this work are no longer handled, as the S_LOADs are not currently marked `dereferenceable` (but that is a separate issue).
We don't really need the isSMRD check together with isDereferenceableInvariantLoad. We should also be able to handle any load, not just SMRD.
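Something like this (untested sketch), dropping the redundant isSMRD guard:

  if (isVOP1(MI) || isVOP2(MI) || isVOP3(MI) || isSDWA(MI) || isSALU(MI) ||
      MI.isDereferenceableInvariantLoad()) {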
================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:2316
+ unsigned SubIdx, const MachineInstr &Orig,
+ const TargetRegisterInfo &TRI) const {
+
----------------
Don't name the TRI parameter; use the RI member here instead.
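i.e., something like this (sketch, assuming the usual TargetInstrInfo::reMaterialize override signature):

  void SIInstrInfo::reMaterialize(MachineBasicBlock &MBB,
                                  MachineBasicBlock::iterator I,
                                  Register DestReg, unsigned SubIdx,
                                  const MachineInstr &Orig,
                                  const TargetRegisterInfo &/*TRI*/) const {
    // ... refer to the RI member instead of the parameter in the body ...
  }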
================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:2322-2323
+ switch (Opcode) {
+
+ // TODO: Handle more cases
+ case AMDGPU::S_LOAD_DWORDX16_IMM: {
----------------
Fix the weird line breaks, and put the comment before the switch.
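i.e.:

  // TODO: Handle more cases.
  switch (Opcode) {
  case AMDGPU::S_LOAD_DWORDX16_IMM: {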
================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:2328
+
+ if (I == MBB.end() || I->getOpcode() == AMDGPU::COPY)
+ break;
----------------
I'd hope we have access to a live lane mask that we could check here. Failing that, do you really need the COPY opcode exception?
================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:2331
+
+ auto MO = I->findRegisterUseOperand(Orig.getOperand(0).getReg());
+ if (!MO || MO->getSubReg() == AMDGPU::NoSubRegister)
----------------
don't use auto here, or at least use a const *
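e.g.:

  const MachineOperand *MO =
      I->findRegisterUseOperand(Orig.getOperand(0).getReg());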
================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:2335
+
+ unsigned Offset = TRI.getSubRegIdxOffset(MO->getSubReg());
+ unsigned SubregSize = TRI.getSubRegIdxSize(MO->getSubReg());
----------------
s/TRI/RI/
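i.e.:

  unsigned Offset = RI.getSubRegIdxOffset(MO->getSubReg());
  unsigned SubregSize = RI.getSubRegIdxSize(MO->getSubReg());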
================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:2342
+
+ if (OrigSize == SubregSize)
+ break;
----------------
Don't think this can happen
================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:2354
+ MI->getOperand(0).setIsUndef(false);
+ MRI.setRegClass(DestReg, &AMDGPU::SGPR_256RegClass);
+
----------------
Can you go through getSubRegisterClass (possibly with getMatchingSuperRegClass and getSubClassWithSubReg) to avoid hardcoding this?
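A rough, untested sketch of what I have in mind, deriving the class from the original def's register class and the use's subregister index instead of hardcoding SGPR_256:

  const TargetRegisterClass *OrigRC =
      MRI.getRegClass(Orig.getOperand(0).getReg());
  if (const TargetRegisterClass *SubRC =
          RI.getSubRegisterClass(OrigRC, MO->getSubReg()))
    MRI.setRegClass(DestReg, SubRC);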
================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:2362
+ getNamedOperand(*MI, AMDGPU::OpName::offset)->getImm();
+ unsigned FinalOffset = OrigOffset + Offset;
+ getNamedOperand(*MI, AMDGPU::OpName::offset)->setImm(FinalOffset);
----------------
keep this all as int64_t
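i.e. (with Offset widened to int64_t as well):

  int64_t OrigOffset = getNamedOperand(*MI, AMDGPU::OpName::offset)->getImm();
  int64_t FinalOffset = OrigOffset + Offset;
  getNamedOperand(*MI, AMDGPU::OpName::offset)->setImm(FinalOffset);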
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D154083/new/
https://reviews.llvm.org/D154083
More information about the llvm-commits mailing list