[PATCH] D142782: [AMDGPU] Add basic support for extended i8 perm matching

Jeffrey Byrnes via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 20 11:13:02 PDT 2023


jrbyrnes added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9856
+  // depth
+  if (Depth >= 8)
+    return std::nullopt;
----------------
arsenm wrote:
> Why 8? Usually 6 is the one true recursion depth limit
Based on testing, I can lower the limit, but we must still accept depth 6, since that is the maximum tree depth across the build vectors that should be lowered into v_perm. The relevant test is already included in permute_i8.ll. The limit may need to change in a future iteration.
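
For readers following along, the depth cutoff can be illustrated with a minimal standalone sketch. It is not the code from the patch: Node, MaxDepth, and findByte are hypothetical names, and the walk follows a single operand chain rather than real SelectionDAG operands. Only the `Depth >= 8` check mirrors the quoted hunk.

```cpp
#include <optional>

// Hypothetical expression node; a stand-in for illustration, not an SDValue.
struct Node {
  const Node *Operand = nullptr; // next node to look through, null for a leaf
  int Byte = 0;                  // payload carried by a leaf
};

// Assumed limit, mirroring the `Depth >= 8` check in the quoted hunk.
constexpr unsigned MaxDepth = 8;

// Walk down the operand chain and return the leaf's byte, or std::nullopt
// if the chain is deeper than the limit allows.
std::optional<int> findByte(const Node *N, unsigned Depth = 0) {
  if (Depth >= MaxDepth)
    return std::nullopt; // give up instead of recursing further
  if (!N)
    return std::nullopt;
  if (!N->Operand)
    return N->Byte; // leaf reached within the limit
  return findByte(N->Operand, Depth + 1);
}
```

In this sketch a leaf reached at depth 6 is still accepted, while a leaf that would only be reached at depth 8 is rejected.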


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9955
+      return Op.getOpcode() == ISD::ZERO_EXTEND
+                 ? std::optional<ByteProvider<SDValue>>(
+                       ByteProvider<SDValue>::getConstantZero())
----------------
arsenm wrote:
> Do you really need the explicit std::optional?
Yes. The compiler cannot infer std::optional for the conditional expression, even after changing the order of the operands.
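
A minimal standalone sketch of the issue, assuming the other arm of the conditional is std::nullopt (the quoted hunk does not show it). Provider is a hypothetical stand-in for ByteProvider<SDValue>, and zeroIf is an illustrative name, not a function from the patch.

```cpp
#include <optional>

// Hypothetical stand-in for ByteProvider<SDValue>; not the LLVM type.
struct Provider {
  int Value;
  static Provider getConstantZero() { return Provider{0}; }
};

std::optional<Provider> zeroIf(bool IsZeroExtend) {
  // Without the explicit wrapper the two arms would have the unrelated types
  // Provider and std::nullopt_t. The conditional operator needs a common
  // type for its arms and does not look at the function's return type, so
  // the following is rejected regardless of which arm comes first:
  //   return IsZeroExtend ? Provider::getConstantZero() : std::nullopt;
  return IsZeroExtend ? std::optional<Provider>(Provider::getConstantZero())
                      : std::nullopt;
}
```

Once one arm is already a std::optional<Provider>, std::nullopt converts to it implicitly, which is why wrapping a single arm is enough.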


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142782/new/

https://reviews.llvm.org/D142782


