[llvm] [AMDGPU] Add basic patterns to select lshl_or instead of v_perm (PR #65693)

Jeffrey Byrnes via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 8 10:48:09 PDT 2023


jrbyrnes wrote:

It was an attempt to delete the `hasNon16BitAccesses` heuristic, but, after further thought, the heuristic is the correct approach.

This approach is not even correct as it stands: the upper 16 bits of the non-shifted source must either be provably 0 or be masked off, and in the latter case it is better to use a perm. The better approach seems to be to capture that condition in the heuristic, and avoid / emit the perm accordingly.
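
To illustrate the correctness problem, here is a minimal standalone sketch in plain C++ (not LLVM code). The helper names `lshlOr16`, `packLow16` and `upper16Zero` are hypothetical, and `packLow16` only models the intended result of packing the low halves of two 32-bit values, not the actual `v_perm` encoding.

```cpp
#include <cassert>
#include <cstdint>

// Models the shape of an lshl_or with a shift amount of 16:
// (hi << 16) | lo. The non-shifted source `lo` is OR'ed in unmasked.
constexpr uint32_t lshlOr16(uint32_t hi, uint32_t lo) {
  return (hi << 16) | lo;
}

// Reference result (assumed intent): the low 16 bits of `hi` in the upper
// half and the low 16 bits of `lo` in the lower half.
constexpr uint32_t packLow16(uint32_t hi, uint32_t lo) {
  return (hi << 16) | (lo & 0xFFFFu);
}

// The condition that would have to be proven for the unmasked form to be safe.
constexpr bool upper16Zero(uint32_t v) { return (v >> 16) == 0; }

// Small self-check of the two cases discussed above.
inline void checkExamples() {
  // Upper 16 bits of `lo` are zero: both forms agree.
  assert(upper16Zero(0x00005678u));
  assert(lshlOr16(0x1234u, 0x00005678u) == packLow16(0x1234u, 0x00005678u));

  // Upper 16 bits of `lo` are set: the unmasked OR corrupts the upper half.
  assert(!upper16Zero(0xABCD5678u));
  assert(lshlOr16(0x1234u, 0xABCD5678u) != packLow16(0x1234u, 0xABCD5678u));
}
```

When the upper bits cannot be proven zero, the extra mask is what makes the shift-and-or form correct, and that is the case where a single perm is preferable.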

computeKnownBits does know about AMDGPUPerm, but there is an outstanding issue.

https://github.com/llvm/llvm-project/pull/65693

