[PATCH] D29088: Do not create ctlz/cttz(X, false) when the target do not support zero defined ctlz/cttz.
Andrea Di Biagio via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 26 09:11:44 PST 2017
andreadb added a comment.
> Can you clarify why the flag is suboptimal by itself? The intrinsic carries the same semantic as the unfolded sequence, isn't it?
Yes. That is exactly what I originally pointed out in this thread.
> This seems to me like just a missing optimization here to recover that at this point: can't we just figure that %neg can't be zero and turn the flag to true?
I agree that we are currently missing an optimization.
That said, if I remember correctly, the only place where we form cttz/ctlz with `is_zero_undef=false` is `foldSelectCttzCtlz()`, and the only goal of that transform is to canonicalize cttz/ctlz in preparation for codegen. That is why I suggested considering moving that transform into CGP (CodeGenPrepare). If we do this, we no longer need to add extra optimization rules to "fix" the fact that we canonicalized prematurely.
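
For readers following the thread, here is a minimal sketch of the equivalence being discussed, written as plain C++ rather than LLVM IR. The function names are hypothetical models of the two intrinsic forms for a 32-bit value, not LLVM APIs, and the sketch only illustrates why the folded form and the unfolded select sequence carry the same semantics.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of cttz(x, is_zero_undef=false): defined for every
// input, returning the bit width (32) when x == 0.
unsigned cttz_zero_defined(uint32_t x) {
    if (x == 0) return 32;
    unsigned n = 0;
    while ((x & 1u) == 0) { x >>= 1; ++n; }
    return n;
}

// Hypothetical model of cttz(x, is_zero_undef=true): the caller must
// guarantee x != 0; this model asserts instead of returning garbage.
unsigned cttz_zero_undef(uint32_t x) {
    assert(x != 0 && "result is undefined for zero in this model");
    unsigned n = 0;
    while ((x & 1u) == 0) { x >>= 1; ++n; }
    return n;
}

// The unfolded sequence: select(x == 0, 32, cttz(x, true)).
// The zero check guards the zero-undef form, so the result is defined
// for every input and matches cttz_zero_defined.
unsigned unfolded_select(uint32_t x) {
    return x == 0 ? 32u : cttz_zero_undef(x);
}
```

In this model, `unfolded_select(x)` and `cttz_zero_defined(x)` agree for every `x`, which is the sense in which the fold is only a canonicalization. Whether it pays off depends on whether the target has a cheap zero-defined instruction.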
https://reviews.llvm.org/D29088