[PATCH] D142782: [AMDGPU] Add basic support for extended i8 perm matching
Jeffrey Byrnes via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 20 11:13:02 PDT 2023
jrbyrnes added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9856
+ // depth
+ if (Depth >= 8)
+ return std::nullopt;
----------------
arsenm wrote:
> Why 8? Usually 6 is the one true recursion depth limit
Based on testing, I can lower the depth, but we must accept depth 6, since that is the maximum tree depth across the build vectors that should be lowered into v_perm. The relevant test is already included in permute_i8.ll. This limit may need to change in a future iteration.
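For readers following the depth discussion, here is a minimal sketch of a depth-limited recursive search that bails out with std::nullopt. It is not the code from the patch: Node, findLeaf, makeChain and MaxDepth are hypothetical names, and the way depth is counted here may differ from SIISelLowering.cpp.

```cpp
#include <memory>
#include <optional>
#include <utility>

// Hypothetical expression tree; not the SelectionDAG types from the patch.
struct Node {
  int Value = 0;               // payload, meaningful only at a leaf
  std::unique_ptr<Node> Child; // single operand; null for a leaf
};

// Illustrative limit only (assumption); the patch chooses its own value.
constexpr unsigned MaxDepth = 8;

// Walks down the operand chain and returns the leaf value, or std::nullopt
// once the current depth reaches the limit.
std::optional<int> findLeaf(const Node &N, unsigned Depth = 0) {
  if (Depth >= MaxDepth)
    return std::nullopt; // give up rather than recurse further
  if (!N.Child)
    return N.Value;
  return findLeaf(*N.Child, Depth + 1);
}

// Builds a chain with the given number of edges above a leaf holding Value.
std::unique_ptr<Node> makeChain(unsigned Edges, int Value) {
  auto Root = std::make_unique<Node>();
  Root->Value = Value;
  for (unsigned I = 0; I < Edges; ++I) {
    auto Parent = std::make_unique<Node>();
    Parent->Child = std::move(Root);
    Root = std::move(Parent);
  }
  return Root;
}
```

In this sketch a leaf six edges below the root is still found, while a leaf eight edges below is rejected. With a `Depth >= Limit` check of this shape, a limit of exactly 6 would reject a leaf at depth 6, so the limit has to be strictly greater than the deepest tree that must be accepted.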
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9955
+ return Op.getOpcode() == ISD::ZERO_EXTEND
+ ? std::optional<ByteProvider<SDValue>>(
+ ByteProvider<SDValue>::getConstantZero())
----------------
arsenm wrote:
> Do you really need the explicit std::optional?
Yes, the compiler cannot infer std::optional here, even after changing the order of the operands.
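A small sketch of why the explicit wrapper can be needed in a conditional expression. The quoted hunk does not show the other arm of the ternary, so this assumes it is std::nullopt. Provider and pick are hypothetical stand-ins, not the ByteProvider code from the patch.

```cpp
#include <optional>

// Hypothetical stand-in for the ByteProvider type in the patch.
struct Provider {
  int Byte = 0;
  static Provider getConstantZero() { return Provider{0}; }
};

std::optional<Provider> pick(bool IsZeroExtend) {
  // return IsZeroExtend ? Provider::getConstantZero() : std::nullopt;
  // The line above does not compile: Provider and std::nullopt_t cannot be
  // converted to each other, so the conditional operator has no result type.
  // The function's return type is not consulted when typing the expression.
  return IsZeroExtend
             ? std::optional<Provider>(Provider::getConstantZero())
             : std::nullopt; // std::nullopt converts to std::optional<Provider>
}
```

Swapping the two arms does not help, because the conditional operator needs one operand to be convertible to the type of the other regardless of order. Wrapping either arm in std::optional gives it that common type.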
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D142782/new/
https://reviews.llvm.org/D142782