[PATCH] D100430: [AMDGPU][GlobalISel] Widen 1 and 2 byte scalar loads

Tue Apr 20 08:33:39 PDT 2021

foad added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp:1182
+        auto WideLoad = B.buildLoadFromOffset(S32, PtrReg, *MMO, 0);
+        auto Mask = B.buildConstant(
+            S32, APInt::getLowBitsSet(S32.getScalarSizeInBits(), MemSize));
----------------
vangthao wrote:
> foad wrote:
> > Use B.buildZExtInReg.
> I attempted to use B.buildZExtInReg but this function creates a new virtual register as the destination instead of using MI.getOperand(0)'s register when passing it as the first argument. This is unlike buildSExtInReg which uses the first argument as the destination.
Ah, that's a bug in the implementation:
```

--- a/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp
@@ -492,7 +492,7 @@ MachineInstrBuilder MachineIRBuilder::buildZExtInReg(const DstOp &Res,
   LLT ResTy = Res.getLLTTy(*getMRI());
   auto Mask = buildConstant(
       ResTy, APInt::getLowBitsSet(ResTy.getScalarSizeInBits(), ImmOp));
-  return buildAnd(ResTy, Op, Mask);
+  return buildAnd(Res, Op, Mask);
 }
 
 MachineInstrBuilder MachineIRBuilder::buildCast(const DstOp &Dst,
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100430/new/

https://reviews.llvm.org/D100430