[PATCH] D116441: [AMDGPU][GlobalISel] Select op_sel modifiers for VOP3P
Mirko Brkusanin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 26 10:17:17 PST 2022
mbrkusanin marked an inline comment as not done.
mbrkusanin added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:3457
+ Mods ^= SISrcMods::OP_SEL_1;
+ Src = MI->getOperand(1).getReg();
+ }
----------------
foad wrote:
> I don't understand how this can be safe, when you never even look at MI->getOperand(2). Did you mean to test the mask elements for equality, e.g. `ShuffleMask[0] == 1` instead of `ShuffleMask[0] != 0`?
I mistakenly thought that ShuffleMask.size() == 2 would be enough of a restriction. That was a bug. Also, a lot more cases are covered now.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:3478
+ // Ignore HiSrc if it's undef.
+ if (MRI.getVRegDef(HiSrc)->getOpcode() == AMDGPU::G_IMPLICIT_DEF) {
+ if (NegLo)
----------------
foad wrote:
> Are there tests for this case? If HiSrc is undef then I am a bit surprised that we would ever match m_GShl(m_Reg(HiSrc), m_SpecificICst(16)) above.
>
> Also, do we need an equivalent case for when LoSrc is undef?
Cases where hi part is undef are common when operations on <N x 16> vectors with odd number of elements are broken down into multiple packed instructions. Arguments for last packed instruction will be undef for high part.
Added new test insert_with_undef which has case with both lo and hi as undef.
As for matching above,
these come from RegBankSelect when it breaks down G_BUILD_VECTOR(_TRUNC) for <2 x 16>.
It does not check if either low or high part is undef and everything matched above is always constructed.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D116441/new/
https://reviews.llvm.org/D116441
More information about the llvm-commits
mailing list