[PATCH] D113291: [AggressiveInstCombine] Lower Table Based CTTZ

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 23 04:11:49 PDT 2022


dmgreen added a comment.

Thanks for the updates. I don't think I have anything else than is what is below. Any other comments from anyone else?



================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:570-573
+  if (!GVTable)
+    return false;
+
+  if (!GVTable->hasInitializer())
----------------
```
if (!GVTable || !GVTable->hasInitializer())
  return false;
```


================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:606-608
+  Type *XType = X1->getType();
+  if (!XType->isIntegerTy(InputBits))
+    return false;
----------------
Remove this check, as it is always true as far as I can tell.


================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:493
+
+  return Matched == InputBits;
+}
----------------
djtodoro wrote:
> dmgreen wrote:
> > djtodoro wrote:
> > > dmgreen wrote:
> > > > If the length of the table is larger than the InputBits, how are we sure that the matched elements will be the correct ones? Is this always guaranteed?
> > > > I think I would have expected a check that `for each i=0..InputBits-1, Table[(Mul<<i)>>Shift] == i`. With a check that the index is in range. Are they always equivalent with the larger tables too?
> > > Hmmm, can you please walk me through out an example?
> > Hmm. No, I'm not sure I can.
> > 
> > I was thinking about the ctz2 case, and whether there could be cases where the u's have different values that make them "match", but the values are different that make them wrong. So the items in the table accessed by the DeBruijn constant would produce incorrect values, but there are still InputBits number of matches.
> > ```
> > ;; int ctz2(unsigned x)
> > ;; {
> > ;; #define u 0
> > ;;  static short table[64] =
> > ;;   {
> > ;;     32, 0, 1, 12, 2, 6, u, 13, 3, u, 7, u, u, u, u, 14,
> > ;;     10, 4, u, u, 8, u, u, 25, u, u, u, u, u, 21, 27, 15,
> > ;;     31, 11, 5, u, u, u, u, u, 9, u, u, 24, u, u, 20, 26,
> > ;;     30, u, u, u, u, 23, u, 19, 29, u, 22, 18, 28, 17, 16, u
> > ;;   };
> > ;; x = (x & -x) * 0x0450FBAF;
> > ;; return table[x >> 26];
> > ;; }
> > ```
> > 
> > But I don't think that is something that can come up. I was finding it hard to prove, but if the Mul is InputBits in length there are only at most InputBits separate elements that it can access. And multiple elements cannot map successfully back to the same i. 
> > 
> > I ran a sat solver overnight, and it is still going but hasn't found any counter examples, which is a good sign. (It is able to find valid DeBruijn CTTZ tables given the chance).
> > 
> > It might be worth adding a comment explaining why this correctly matches the table in all cases.
> Yeah, good sign. I will try to make a reasonable comment. Thanks.
A comment explaining this function would still be useful.


================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:552
+      LI->getPointerOperand()->getType()->isOpaquePointerTy()
+          ? LI->getType()
+          : LI->getPointerOperandType()->getNonOpaquePointerElementType();
----------------
djtodoro wrote:
> dmgreen wrote:
> > Can this always just use the Load type?
> I think yes, nothing comes up to my mind that can break it.
This can be changed to just the Load type then.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113291/new/

https://reviews.llvm.org/D113291



More information about the llvm-commits mailing list