[PATCH] D128911: Emit table lookup from TargetLowering::expandCTTZ()

Mon Jul 25 05:51:39 PDT 2022

dmgreen added inline comments.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:7795
+  APInt DeBruijn(32, 0x077CB531U);
+  APInt BitWidth(32, NumBitsPerElt);
+  APInt ShiftAmt(32, (BitWidth - BitWidth.logBase2()).getZExtValue());
----------------
gsocshubham wrote:
> dmgreen wrote:
> > Sorry the suggestion for using APInt wasn't very clear. The point is that DeBruijn isn't necessarily a 32bit quantity - there are other values for 8/16/64. We should add i64 at least, so long as they look beneficial (which they should for 64, not sure about the other values). APInt makes that simpler because it can store any size without overflow, and the operations are performed on the right size.
> > 
> > BitWidth and ShiftAmt can remain as unsigned - the values will always easily fit in an unsigned value. ShiftWidth can be `Bitwidth - Log2_32(BitWidth)`. I would rename the "NumBitsPerElt" argument to "Bitwidth" too, to make it clear what it is and avoid the need for the two.
> Here, do you mean to change `APInt DeBruijn(32, 0x077CB531U)` to `APInt DeBruijn(64, 0x0218A392CD3D5DBF)`?
> 
> `NumBitsPerElt` is 32 even in the case for `call i64 @llvm.cttz.i64(i64 %x, i1 true)`
It would be based on the BitWidth, providing that the BitWidth is known to be 32 or 64:
```
APInt DeBruijn = BitWidth == 32 ? APInt(32, 0x077CB531U) : APInt(64, 0x0218A392CD3D5DBFULL)
```

For some targets the i64 cttz will be legalized to a i32 cttz. It is for 64bit targets that the 64bit variant is useful.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:7829
+  Align Alignment = cast<ConstantPoolSDNode>(CPIdx)->getAlign();
+  return DAG.getLoad(
+      VT, dl, DAG.getEntryNode(), DAG.getMemBasePlusOffset(CPIdx, Lookup, dl),
----------------
gsocshubham wrote:
> dmgreen wrote:
> > The load should be loading MVT::i8, extended the result into VT.
> Can you please elaborate it? I did not understand it. Do you mean to change `VT` to `MVT::i8` in the return statement?
We want to create an array of i8 elements, load an i8 from it at the right index, and extend that to the original VT. That can either be done by creating a load of an i8 and calling DAG.getZExtOrTrunc - but that might introduce an illegal type where one cannot be created. So it may need to create a ZEXTLOAD load directly.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128911/new/

https://reviews.llvm.org/D128911