[PATCH] D87158: [AMDGPU] Fix for folding v2.16 literals.
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 8 10:09:50 PDT 2020
rampitec added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIFoldOperands.cpp:195
!(MI->getDesc().TSFlags & SIInstrFlags::IsMAI) &&
- AMDGPU::isInlinableLiteralV216(static_cast<uint16_t>(Fold.ImmToFold),
- ST.hasInv2PiInlineImm())) {
+ AMDGPU::isFoldableLiteralV216(Fold.ImmToFold,
+ ST.hasInv2PiInlineImm())) {
----------------
dfukalov wrote:
> rampitec wrote:
> > Seems like you still need to check if it is an inline literal, not just foldable. If it is foldable but not inline you can do it only with VOP3 literals available, in which case you do not need to play these games with op_sel at all.
> >
> > Then I also do not see where the check for vop3 literals is done here. Imagine you would get one and produce op_sel version of something foldable but not inline on GFX9.
> This case is about op_sel folding optimisation only (condition in line 193 checks the instruction has `SIInstrFlags::IsPacked` flag so it is VOP3). If a literal is not foldable with op_sel, it is processed by code below.
It is not sufficient to check for packed. Assume that literal is 0x200. It is foldable according to your check because high half is zero. Then the code as written will fold it. Same for 0x02000000 because low half is zero. But this is only legal to do on GFX10 which hasVOP3Literal(). As far as I understand it will also fold it on GFX9 where it is not legal. I do not see this legality check here.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D87158/new/
https://reviews.llvm.org/D87158
More information about the llvm-commits
mailing list