[PATCH] D99427: [LoopIdiomRecognize] Teach CTLZ/CTTZ idiom recognition to handle not being able to find a pre-loop check for the input being 0.

Sun Mar 28 05:57:06 PDT 2021

spatel added a comment.

This makes sense to me, but I'm not as familiar with LIR as others, so might want to get a 2nd look.

================
Comment at: llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp:1781-1784
+    // If we didn't find the zero check, we need to assume we'll execute one
+    // iteration of the loop when the input is zero. If the input is zero, we
+    // need to select the initial value incremented/decremented by the count
+    // instruction's immediate operand.
----------------
It's not clear to me what cost model metrics we're using. Does this change the equation?

================
Comment at: llvm/test/Transforms/LoopIdiom/X86/ctlz.ll:259-260
 ; assume builtin is always profitable.
+; The C code here represents how other loop transformations and instcombine
+; may optimize the earlier ctlz_zero_check test case. The precondition check
+; for 0 is using the input to the absolute value pattern instead of the output.
----------------
Can you add a -O1 (or -O2?) test in a pre-commit to PhaseOrdering that starts from the IR for the actual C code we're trying to model? (could run it through -mem2reg if that makes it less obnoxious).

That way, we'll have coverage to make sure we can get through a full instcombine+LIR+instcombine sequence with the expected output.

================
Comment at: llvm/test/Transforms/LoopIdiom/X86/ctlz.ll:713-714

 ; We can't easily transform this loop. It returns 1 for an input of both
 ; 0 and 1.
 ;
----------------
But we do transform this loop now - update comment.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99427/new/

https://reviews.llvm.org/D99427