[PATCH] D128911: Emit table lookup from TargetLowering::expandCTTZ()

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Jul 30 00:52:32 PDT 2022


dmgreen added inline comments.


================
Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:7960
+                                        unsigned BitWidth) const {
+  APInt DeBruijn = BitWidth == 32 ? APInt(32, 0x077CB531U)
+                                  : APInt(64, 0x0218A392CD3D5DBFULL);
----------------
If BitWidth isn't 32 or 64, we need to return SDValue.
We need to make sure the 64it value is tested too.


================
Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:7853
+    unsigned int Lshr = DeBrujin << i;
+    unsigned int Rshr = Lshr >> 27;
+    RshrArr[Rshr] = i;
----------------
gsocshubham wrote:
> craig.topper wrote:
> > gsocshubham wrote:
> > > dmgreen wrote:
> > > > gsocshubham wrote:
> > > > > dmgreen wrote:
> > > > > > I think this should technically be a rotl, because the DeBruijn sequences are cyclic. The top element in the sequence is always 0 though, so it probably doesn't make a difference in practice.
> > > > > > 
> > > > > > It may be better to us APInt for all the DeBrujin constants though. It is easier to be more precise with them, and it's useful when i64 constants are supported.
> > > > > I can use `DAG.getNode(ISD::ROTL, dl, VT, Op, DAG.getConstant(i, dl, SHVT));` instead of `<<` but this changes variable from `unsigned int` to `SDNode`.
> > > > That seems to be mixing up the instructions we want to produce in the output (DAG.getNode) and the calculations we are doing at compile time.
> > > > If DeBruijn is an APInt of the correct size `APInt DeBruijn(32, 0x077CB531U)`, then it will have a rotl method. The advantage of APInt is that they will also be able to work with other bitwidths, once those are added.
> > > Understood. I have used - `APInt DeBruijn(32, 0x077CB531U)`
> > > 
> > > Is it correct way? Can you suggest a way to simplify below usage of APInt?
> > Is it really cyclic? The multiply is a shift by a power of 2. So emulating the multiply should be a shl not rotl.
> Updated to `shl()`.
Yeah DeBruijn sequences are cyclic. They are explained here: https://en.wikipedia.org/wiki/De_Bruijn_sequence#Construction. The lower bits of the last elements use the wrapped-around upper bits of the constant. A rotate is more general than a shift, but for the constants we chose the upper value in the constant is always zero, so they become equivalent. If we are relying on a shift on the generated code, using a shift here too sounds OK.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128911/new/

https://reviews.llvm.org/D128911



More information about the llvm-commits mailing list