[PATCH] D128252: [AMDGPU] Lowering VGPR to SGPR copies to v_readfirstlane_b32 if profitable.
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 13 10:46:07 PDT 2022
rampitec added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:946
+ // Next unique ID to use while new instance created.
+ static unsigned NextID;
+
----------------
You probably need to reset NextID to zero with each run of the pass. Better though make it a normal member of the Pass class itself.
================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:1148
+ TII->buildExtractSubReg(Result, *MRI, MI->getOperand(1), SrcRC,
+ TRI->getSubRegFromChannel(i), &AMDGPU::VGPR_32RegClass);
+ Register PartialDst =
----------------
Run clang-format again?
================
Comment at: llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp:1134
+ else
+ MIB.addReg(SrcReg);
+ } else {
----------------
alex-t wrote:
> rampitec wrote:
> > What happens to 16 bit subregs?
> VGPR to SGPR copies are inserted by InstrEmitter to adjust the VALU result to the SALU consumer.
> The 16bits in VGPR result are packed and adjusted to the consumer by inserting the EXCTRACT_ELEMENT lowered in another place.
> What kind of adjustment would you recommend if we have a 16bit VGPR source?
> Zero-extend it to 32bit?
>
Assume the input like:
```
%0:SGPR_LO16 = COPY %1.lo16:VGPR_32
```
If I read it right it will produce V_READFIRSTLANE_B32 with a 16 bit destination and source, which does not work. Assume that selection managed to produce such input, which path will it take here?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D128252/new/
https://reviews.llvm.org/D128252
More information about the llvm-commits
mailing list