[PATCH] D159533: [DAG] getNode() - fold (zext (trunc x)) -> x iff the upper bits are known zero - add SRL support

Wed Sep 20 12:19:04 PDT 2023

jrbyrnes added a comment.

> what is the best way to generalize the performOrCombine handling to support ISD::FSHR as well?

Ideally, we should produce AMDGPUISD::PERM whenever an i32 is a byte permutation of two sources (with some special-case exceptions), and for target-specific reasons it is best to do this via DAG combining. In practice, such byte permutations typically have ISD::OR as the root of the tree, which is why I began with that combine. However, there are other cases where the root is not ISD::OR (e.g. AMDGPUISD::PERM, ISD::FSHR, etc.). It seems to me that the best way to generalize is to extract the common code from the ISD::OR combine and use it to enable new roots / entry points; this is an extension of work we should do anyway. I hacked together an implementation of what I just described, and with it this patch no longer causes the v_perm miss regressions. @foad @arsenm any objections?
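
To illustrate why an ISD::FSHR root can describe the same byte permutation as an ISD::OR root, here is a minimal standalone sketch. It is not code from the patch or from the AMDGPU backend: `permBytes`, `fshr32`, and `orRooted` are hypothetical helpers, and the selector encoding (values 0-7 indexing the bytes of the 64-bit concatenation of the two sources) is a simplified assumption rather than the exact v_perm_b32 selector format.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a two-source byte permutation. Each sel[i] in 0..7
// picks one byte of the 64-bit value {hi:lo}; 0 is the lowest byte of lo and
// 7 is the highest byte of hi. This is an assumed, simplified encoding.
static uint32_t permBytes(uint32_t hi, uint32_t lo, const uint8_t sel[4]) {
  const uint64_t src = (uint64_t(hi) << 32) | lo;
  uint32_t result = 0;
  for (int i = 0; i < 4; ++i) {
    const uint32_t byte = uint32_t(src >> (8 * (sel[i] & 7))) & 0xffu;
    result |= byte << (8 * i);
  }
  return result;
}

// Funnel shift right for i32: the low 32 bits of ({hi:lo} >> (amt % 32)).
static uint32_t fshr32(uint32_t hi, uint32_t lo, unsigned amt) {
  const uint64_t src = (uint64_t(hi) << 32) | lo;
  return uint32_t(src >> (amt % 32));
}

// The same value written as a tree rooted at OR (shift amount fixed at 8).
static uint32_t orRooted(uint32_t hi, uint32_t lo) {
  return (hi << 24) | (lo >> 8);
}
```

With hi = 0xAABBCCDD, lo = 0x11223344 and the selector {1, 2, 3, 4}, all three helpers compute 0xDD112233, so a funnel shift by a multiple of 8 bits is one of the byte permutations that a shared helper could recognize regardless of which node is the root.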

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159533/new/

https://reviews.llvm.org/D159533