[PATCH] D116441: [AMDGPU][GlobalISel] Select op_sel modifiers for VOP3P

Mirko Brkusanin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 26 10:17:17 PST 2022


mbrkusanin marked an inline comment as not done.
mbrkusanin added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:3457
+          Mods ^= SISrcMods::OP_SEL_1;
+        Src = MI->getOperand(1).getReg();
+      }
----------------
foad wrote:
> I don't understand how this can be safe, when you never even look at MI->getOperand(2). Did you mean to test the mask elements for equality, e.g. `ShuffleMask[0] == 1` instead of `ShuffleMask[0] != 0`?
I mistakenly thought that ShuffleMask.size() == 2 would be enough of a restriction. That was a bug. Also, a lot more cases are covered now.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:3478
+      // Ignore HiSrc if it's undef.
+      if (MRI.getVRegDef(HiSrc)->getOpcode() == AMDGPU::G_IMPLICIT_DEF) {
+        if (NegLo)
----------------
foad wrote:
> Are there tests for this case? If HiSrc is undef then I am a bit surprised that we would ever match m_GShl(m_Reg(HiSrc), m_SpecificICst(16)) above.
> 
> Also, do we need an equivalent case for when LoSrc is undef?
Cases where hi part is undef are common when operations on <N x 16> vectors with odd number of elements are broken down into multiple packed instructions. Arguments for last packed instruction will be undef for high part.

Added new test insert_with_undef which has case with both lo and hi as undef.

As for matching above,
these come from RegBankSelect when it breaks down G_BUILD_VECTOR(_TRUNC) for <2 x 16>. 
It does not check if either low or high part is undef and everything matched above is always constructed.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116441/new/

https://reviews.llvm.org/D116441



More information about the llvm-commits mailing list