[PATCH] D113448: [AMDGPU] Check for unneeded shift mask in shift PatFrags.

Abinav Puthan Purayil via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 19 00:43:45 PST 2021


abinavpp marked an inline comment as done.
abinavpp added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:3887
+
+  const MachineInstr *Opnd1Def =
+      getOpcodeDef(TargetOpcode::G_CONSTANT, MI.getOperand(2).getReg(), *MRI);
----------------
foad wrote:
> foad wrote:
> > Could use getIConstantVRegVal here to get the APInt value directly?
> `Opnd1` is a bit confusing because it's MI.getOperand(2). Maybe call it something vague like MaskVal?
For some reason, naming the operands as Val and MaskVal is confusing me, how about LHS and RHS?


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructions.td:261
+
 foreach width = [16, 32, 64] in {
 defvar mask = !sub(width, 1);
----------------
foad wrote:
> Maybe change this to `foreach logwidth = [4, 5, 6]` so you can put the definition of csh_mask_#logwidth inside the loop? Or maybe that's impossible because you need to refer to logwidth inside a C++ code fragment?
Right, I'm not able to refer them in the code fragments.


================
Comment at: llvm/test/CodeGen/AMDGPU/constrained-shift.ll:19
+; GISEL-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GISEL-NEXT:    v_and_b32_e32 v1, 15, v1
+; GISEL-NEXT:    v_lshlrev_b16_e32 v2, v1, v0
----------------
arsenm wrote:
> foad wrote:
> > abinavpp wrote:
> > > Do we see anything obvious in this change that's not allowing us to eliminate the `and` in global-isel for the divergent cases?
> > I think a cross-regbank copy is getting in the way of matching the constant value 15. Maybe use getIConstantVRegValWithLookThrough to look through the copy?
> This is another case where regbankselect or a the post regbank combiner should have materialized the constant in VGPR to begin with
> I think a cross-regbank copy is getting in the way of matching the constant value 15. Maybe use getIConstantVRegValWithLookThrough to look through the copy?

getIConstantVRegValWithLookThrough() in the predicate alone won't help here
since we're not able to match the pattern in the first place.


================
Comment at: llvm/test/CodeGen/AMDGPU/constrained-shift.ll:19
+; GISEL-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GISEL-NEXT:    v_and_b32_e32 v1, 15, v1
+; GISEL-NEXT:    v_lshlrev_b16_e32 v2, v1, v0
----------------
abinavpp wrote:
> arsenm wrote:
> > foad wrote:
> > > abinavpp wrote:
> > > > Do we see anything obvious in this change that's not allowing us to eliminate the `and` in global-isel for the divergent cases?
> > > I think a cross-regbank copy is getting in the way of matching the constant value 15. Maybe use getIConstantVRegValWithLookThrough to look through the copy?
> > This is another case where regbankselect or a the post regbank combiner should have materialized the constant in VGPR to begin with
> > I think a cross-regbank copy is getting in the way of matching the constant value 15. Maybe use getIConstantVRegValWithLookThrough to look through the copy?
> 
> getIConstantVRegValWithLookThrough() in the predicate alone won't help here
> since we're not able to match the pattern in the first place.
> This is another case where regbankselect or a the post regbank combiner should have materialized the constant in VGPR to begin with

Doing something like:
```
--- a/llvm/lib/CodeGen/GlobalISel/RegBankSelect.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/RegBankSelect.cpp
@@ -472,6 +472,10 @@ RegBankSelect::MappingCost RegBankSelect::computeMapping(
     Register Reg = MO.getReg();
     if (!Reg)
       continue;
+
+    if (MO.isUse() && isConstantOrConstantVector(*(MRI->getVRegDef(Reg)), *MRI))
+      continue;
+
     LLVM_DEBUG(dbgs() << "Opd" << OpIdx << '\n');
```
or forcing SGPRRegBank for constant operands in
AMDGPURegisterBankInfo::getDefaultMappingVOP() fixes this problem buts ends up
violating the constant bus restriction for a lot of AMDGPU tests.

I'm not sure how the original PatFrags (i.e. the ones with the masks as literal
constants without predicates) are working correctly in global-isel for *some*
(vector cases and scalar 64-bit cases are not working) of the divergent cases.

Is there a way to write a constant operand in a tblgen DAG that peeks through
trivial cross regbank copies? Or, is there a better way to fix this?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113448/new/

https://reviews.llvm.org/D113448



More information about the llvm-commits mailing list