[llvm] [AMDGPU]: Accept constant zero bytes in v_perm OrCombine (PR #66533)

Jeffrey Byrnes via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 21 14:44:47 PST 2024


jrbyrnes wrote:

Latest iteration adds a heuristic that disables the perm combine when both operands are i8s (either actual i8s or i8s after legalization). A perm of this kind can always be represented as lshl_or.
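
To illustrate why a perm of two byte-sized operands reduces to lshl_or, here is a minimal sketch. It is not code from the PR: permModel and lshlOrModel are hypothetical helper names, and the permute model is deliberately simplified, assuming only selectors 0-7 (byte pick) and 0x0c (constant zero) are used.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of a v_perm_b32-style byte permute.
// Assumption: only selectors 0-7 and 0x0c are modeled here.
//   0-3  -> byte 0-3 of src1
//   4-7  -> byte 0-3 of src0
//   0x0c -> constant zero byte
static uint8_t selectByte(uint32_t src0, uint32_t src1, uint8_t sel) {
  if (sel == 0x0c)
    return 0;
  uint64_t combined = (static_cast<uint64_t>(src0) << 32) | src1;
  return static_cast<uint8_t>(combined >> (8 * (sel & 7)));
}

// Build the 32-bit result one byte at a time from the 4-byte selector mask.
static uint32_t permModel(uint32_t src0, uint32_t src1, uint32_t mask) {
  uint32_t result = 0;
  for (int i = 0; i < 4; ++i) {
    uint8_t sel = static_cast<uint8_t>(mask >> (8 * i));
    result |= static_cast<uint32_t>(selectByte(src0, src1, sel)) << (8 * i);
  }
  return result;
}

// Model of lshl_or: (a << shift) | b.
static uint32_t lshlOrModel(uint32_t a, uint32_t shift, uint32_t b) {
  return (a << shift) | b;
}
```

With zero-extended bytes hi and lo, the mask 0x0c0c0400 places lo in byte 0, hi in byte 1, and zero in the upper two bytes, which is the same value as lshlOrModel(hi, 8, lo).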

I have also updated the heuristic so that we bypass it if the subtarget generation doesn't support lshl_or. This bypass has introduced a set of lit changes (usually in the form of VI checks) that are independent of the constant zero work. These changes occur because we now allow the perm combine to occur more often.
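
The decision described above can be sketched roughly as follows. This is an illustration only, not the code in the PR: SubtargetModel, OperandModel, and shouldFormPerm are hypothetical stand-ins, and "is an i8" is approximated here by a count of significant bits.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; these are not LLVM types.
struct SubtargetModel {
  bool hasLshlOr; // whether the subtarget generation supports lshl_or
};

struct OperandModel {
  unsigned significantBits; // assumption: 8 or fewer means "effectively an i8"
};

// Returns true if the perm combine should be attempted.
static bool shouldFormPerm(const SubtargetModel &st, const OperandModel &lhs,
                           const OperandModel &rhs) {
  // Bypass the heuristic when lshl_or is unavailable: there is no cheaper
  // alternative to prefer, so the perm combine is allowed.
  if (!st.hasLshlOr)
    return true;
  // With lshl_or available, two i8 operands are better served by lshl_or,
  // so the perm combine is disabled for that case.
  bool bothI8 = lhs.significantBits <= 8 && rhs.significantBits <= 8;
  return !bothI8;
}
```

Under this sketch, a subtarget without lshl_or forms perms even for two i8 operands, which matches the extra lit changes noted above.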

https://github.com/llvm/llvm-project/pull/66533

