[PATCH] D159533: [DAG] getNode() - fold (zext (trunc x)) -> x iff the upper bits are known zero - add SRL support
Jeffrey Byrnes via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 20 12:19:04 PDT 2023
jrbyrnes added a comment.
> what is the best way to generalize the performOrCombine handling to support ISD::FSHR as well?
Ideally, we should produce AMDGPUISD::PERM whenever we have an i32 that is a byte permutation of two sources (except in some special cases), and it is best to do this via DAG combining for target-specific reasons. In practice, such byte permutations typically have ISD::OR as the root of the tree, which is why I began with that combine. However, there are other cases where the root of the tree is not ISD::OR (e.g. AMDGPUISD::PERM, ISD::FSHR, etc.). It seems to me that the best way to generalize is to extract the common code from the ISD::OR combine and use it to enable new roots / entry points; this is an extension of work we should do anyway. I hacked together an implementation of what I have just described, and with it this patch no longer causes the v_perm miss regressions. @foad @arsenm objections?
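To illustrate why a non-OR root such as ISD::FSHR can describe the same byte permutation as an OR-rooted tree, here is a minimal standalone sketch. It is not the code from the patch: the names permuteBytes, fshr32, orRooted and checkEquivalence are hypothetical, and the selector encoding is made up for illustration and is not claimed to match the hardware v_perm_b32 encoding.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical selector encoding (illustration only): values 0-3 pick
// bytes 0-3 of `lo`, values 4-7 pick bytes 0-3 of `hi`.
using ByteSel = std::array<uint8_t, 4>;

// Build an i32 whose byte i is the byte of (hi:lo) selected by sel[i].
uint32_t permuteBytes(uint32_t hi, uint32_t lo, const ByteSel &sel) {
  uint64_t pool = (static_cast<uint64_t>(hi) << 32) | lo;
  uint32_t result = 0;
  for (unsigned i = 0; i < 4; ++i) {
    uint32_t byte =
        static_cast<uint32_t>((pool >> (8 * (sel[i] & 7))) & 0xFF);
    result |= byte << (8 * i);
  }
  return result;
}

// Funnel shift right: the low 32 bits of the 64-bit value (hi:lo)
// shifted right by amt modulo 32.
uint32_t fshr32(uint32_t hi, uint32_t lo, unsigned amt) {
  uint64_t pool = (static_cast<uint64_t>(hi) << 32) | lo;
  return static_cast<uint32_t>(pool >> (amt & 31));
}

// The same value written as an OR-rooted tree of shifts.
uint32_t orRooted(uint32_t hi, uint32_t lo) {
  return (lo >> 8) | (hi << 24);
}

// A funnel shift by a whole number of bytes is a byte permutation of
// its two sources, so all three forms agree.
void checkEquivalence(uint32_t hi, uint32_t lo) {
  const ByteSel sel = {1, 2, 3, 4}; // lo bytes 1..3, then hi byte 0
  assert(fshr32(hi, lo, 8) == orRooted(hi, lo));
  assert(fshr32(hi, lo, 8) == permuteBytes(hi, lo, sel));
}
```

In this sketch the OR-rooted and FSHR-rooted forms reduce to the same selector, which is the kind of shared analysis that could sit behind several entry points.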
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D159533/new/
https://reviews.llvm.org/D159533