[llvm] AMDGPU/GlobalISel: add RegBankLegalize rules for bitfield extract (PR #132381)

Petar Avramovic via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 27 05:54:56 PDT 2025


================
@@ -225,6 +228,103 @@ void RegBankLegalizeHelper::lower(MachineInstr &MI,
     MI.eraseFromParent();
     break;
   }
+  case Div_BFE: {
+    Register Dst = MI.getOperand(0).getReg();
+    assert(MRI.getType(Dst) == LLT::scalar(64));
+    bool Signed = isa<GIntrinsic>(MI) ? MI.getOperand(1).getIntrinsicID() ==
+                                            Intrinsic::amdgcn_sbfe
+                                      : MI.getOpcode() == AMDGPU::G_SBFX;
+    unsigned FirstOpnd = isa<GIntrinsic>(MI) ? 2 : 1;
+    // Extract bitfield from Src, LSBit is the least-significant bit for the
+    // extraction (field offset) and Width is size of bitfield.
+    Register Src = MI.getOperand(FirstOpnd).getReg();
+    Register LSBit = MI.getOperand(FirstOpnd + 1).getReg();
+    Register Width = MI.getOperand(FirstOpnd + 2).getReg();
+    // Comments are for signed bitfield extract, similar for unsigned. x is sign
+    // bit. s is sign, l is LSB and y are remaining bits of bitfield to extract.
+
+    // Src >> LSBit Hi|Lo: x?????syyyyyyl??? -> xxxx?????syyyyyyl
+    unsigned SHROpc = Signed ? AMDGPU::G_ASHR : AMDGPU::G_LSHR;
+    auto SHRSrc = B.buildInstr(SHROpc, {{VgprRB, S64}}, {Src, LSBit});
+
+    auto ConstWidth = getIConstantVRegValWithLookThrough(Width, MRI);
+
+    // Expand to Src >> LSBit << (64 - Width) >> (64 - Width)
+    // << (64 - Width): Hi|Lo: xxxx?????syyyyyyl -> syyyyyyl000000000
+    // >> (64 - Width): Hi|Lo: syyyyyyl000000000 -> ssssssssssyyyyyyl
+    if (!ConstWidth) {
+      auto Amt = B.buildSub(VgprRB_S32, B.buildConstant(SgprRB_S32, 64), Width);
+      auto SignBit = B.buildShl({VgprRB, S64}, SHRSrc, Amt);
+      B.buildInstr(SHROpc, {Dst}, {SignBit, Amt});
+      MI.eraseFromParent();
+      return;
+    }
----------------
petar-avramovic wrote:

I did not give it much thought; I just copied the original implementation.
Either way we need 4 instructions: two shifts and a sub, and the fourth can be either a sub or a shift. Is a shift faster than a sub?
```
SHRSrc = Src >> LSBit
Amt = 64 - Width
SignBit = SHRSrc << Amt
Res = SignBit >> Amt
```

```
AmtR = 64 - Width
AmtL = AmtR - LSBit
SignBit = Src << AmtL
Res = SignBit >> AmtR
```
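To sanity-check that the two sequences above agree, here is a minimal scalar sketch in plain C++ for the signed case. It is not the GlobalISel lowering itself, and the function names `sbfeShiftFirst` and `sbfeSubFirst` are made up for illustration. It assumes `1 <= Width` and `LSBit + Width <= 64`, so every shift amount stays below 64, and it relies on `>>` of a negative `int64_t` being an arithmetic shift (guaranteed since C++20, and what mainstream compilers did before that).

```cpp
#include <cassert>
#include <cstdint>

// First sequence: shift right by the field offset, then use a
// shl/ashr pair by (64 - Width) to sign-extend the field.
// Assumes 1 <= Width and LSBit + Width <= 64 (illustration only).
int64_t sbfeShiftFirst(int64_t Src, unsigned LSBit, unsigned Width) {
  int64_t SHRSrc = Src >> LSBit; // arithmetic shift right
  unsigned Amt = 64 - Width;
  // Shift left on the unsigned representation to avoid signed overflow.
  int64_t SignBit = static_cast<int64_t>(static_cast<uint64_t>(SHRSrc) << Amt);
  return SignBit >> Amt;
}

// Second sequence: compute both shift amounts up front (two subs),
// then do a single shl followed by a single ashr.
// Same assumptions as above.
int64_t sbfeSubFirst(int64_t Src, unsigned LSBit, unsigned Width) {
  unsigned AmtR = 64 - Width;
  unsigned AmtL = AmtR - LSBit; // moves the field's top bit to bit 63
  int64_t SignBit = static_cast<int64_t>(static_cast<uint64_t>(Src) << AmtL);
  return SignBit >> AmtR;
}
```

For example, with `Src = 0xF50`, `LSBit = 4` and `Width = 8` the extracted field is `0xF5`, and both functions sign-extend it to `-11`. Outside the stated range (for instance `Width = 0`) the shift amounts reach 64, which is undefined in C++, so the sketch says nothing about those inputs.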


https://github.com/llvm/llvm-project/pull/132381

