[PATCH] D113291: [AggressiveInstCombine] Lower Table Based CTTZ

Djordje Todorovic via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Aug 21 03:06:30 PDT 2022


djtodoro marked an inline comment as done.
djtodoro added inline comments.


================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:493
+
+  return Matched == InputBits;
+}
----------------
dmgreen wrote:
> djtodoro wrote:
> > dmgreen wrote:
> > > If the length of the table is larger than the InputBits, how are we sure that the matched elements will be the correct ones? Is this always guaranteed?
> > > I think I would have expected a check that `for each i=0..InputBits-1, Table[(Mul<<i)>>Shift] == i`. With a check that the index is in range. Are they always equivalent with the larger tables too?
> > Hmmm, can you please walk me through out an example?
> Hmm. No, I'm not sure I can.
> 
> I was thinking about the ctz2 case, and whether there could be cases where the u's have different values that make them "match", but the values are different that make them wrong. So the items in the table accessed by the DeBruijn constant would produce incorrect values, but there are still InputBits number of matches.
> ```
> ;; int ctz2(unsigned x)
> ;; {
> ;; #define u 0
> ;;  static short table[64] =
> ;;   {
> ;;     32, 0, 1, 12, 2, 6, u, 13, 3, u, 7, u, u, u, u, 14,
> ;;     10, 4, u, u, 8, u, u, 25, u, u, u, u, u, 21, 27, 15,
> ;;     31, 11, 5, u, u, u, u, u, 9, u, u, 24, u, u, 20, 26,
> ;;     30, u, u, u, u, 23, u, 19, 29, u, 22, 18, 28, 17, 16, u
> ;;   };
> ;; x = (x & -x) * 0x0450FBAF;
> ;; return table[x >> 26];
> ;; }
> ```
> 
> But I don't think that is something that can come up. I was finding it hard to prove, but if the Mul is InputBits in length there are only at most InputBits separate elements that it can access. And multiple elements cannot map successfully back to the same i. 
> 
> I ran a sat solver overnight, and it is still going but hasn't found any counter examples, which is a good sign. (It is able to find valid DeBruijn CTTZ tables given the chance).
> 
> It might be worth adding a comment explaining why this correctly matches the table in all cases.
Yeah, good sign. I will try to make a reasonable comment. Thanks.


================
Comment at: llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp:547
+  LoadInst *LI = dyn_cast<LoadInst>(&I);
+  if (!LI) return false;
+
----------------
dmgreen wrote:
> One think I forgot to mention - llvm has a code style that is best explained as "just run clang-format on the patch". These returns are all in the wrong place, for example, and could do with a cleanup.
I've changed the style to `Google`, accidentally. Thanks.


================
Comment at: llvm/test/Transforms/AggressiveInstCombine/lower-table-based-cttz-dereferencing-pointer.ll:23
+
+define dso_local i32 @ctz6(i64* nocapture readonly %b) {
+; CHECK-LABEL: @ctz6(
----------------
dmgreen wrote:
> I usually remove dso_local
me as well, leftover :/ thanks!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113291/new/

https://reviews.llvm.org/D113291



More information about the llvm-commits mailing list