[PATCH] D54351: [AMDGPU] combine extractelement into several selects

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Nov 10 14:25:39 PST 2018


arsenm added inline comments.


================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:8057-8058
 
+  // EXTRACT_VECTOR_ELT (<n x e>, var-idx) => n x select (e, const-idx)
+  // This elminates non-constant index and subsequent movrel or scratch access.
+  // Sub-dword vectors of size 2 dword or less have better implementation.
----------------
Is this done as a combine, rather than as custom lowering, in order to handle vectors with illegal types?
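
For readers following the thread, the transform described in the quoted comment can be modeled on plain scalars. This is only an illustrative sketch, not code from the patch: `extractViaSelects` is a hypothetical name, and `std::array` stands in for the vector operand. Each loop iteration corresponds to one select on a constant-index element.

```cpp
#include <array>
#include <cstddef>

// Hypothetical scalar model of the combine (not taken from the patch):
// a variable-index extract is rewritten as a chain of selects, each of
// which compares the index against a constant and picks a fixed element.
template <typename T, std::size_t N>
T extractViaSelects(const std::array<T, N> &Vec, std::size_t Idx) {
  static_assert(N > 0, "vector must have at least one element");
  T Result = Vec[0];
  for (std::size_t I = 1; I < N; ++I)
    Result = (Idx == I) ? Vec[I] : Result; // one select per element
  return Result;
}
```

An out-of-range index yields element 0 in this model. That is acceptable for a sketch, since an out-of-range extractelement in LLVM IR produces poison, so any result is allowed.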


================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:8060-8061
+  // Sub-dword vectors of size 2 dword or less have better implementation.
+  // Vectors of size bigger than 8 dwords would yield too much v_cndmask_b32
+  // instructions.
+  if (VecSize <= 256 && (VecSize > 64 || EltSize >= 32) &&
----------------
Grammar: "too many", not "too much".
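
As a reading aid, the visible part of the quoted condition can be checked against a few vector shapes. This is a sketch under assumptions: `passesVisibleSizeCheck` is a hypothetical helper, sizes are taken to be in bits (inferred from the "2 dword" and "8 dwords" wording in the quoted comments), and the remainder of the condition after the trailing `&&` is cut off in the quote and is not modeled.

```cpp
// Hypothetical helper: models ONLY the portion of the condition visible in
// the quoted hunk. VecSize and EltSize are assumed to be sizes in bits.
constexpr bool passesVisibleSizeCheck(unsigned VecSize, unsigned EltSize) {
  return VecSize <= 256 && (VecSize > 64 || EltSize >= 32);
}

static_assert(passesVisibleSizeCheck(128, 32), "4 x 32-bit: accepted");
static_assert(passesVisibleSizeCheck(64, 32), "2 x 32-bit: accepted");
static_assert(passesVisibleSizeCheck(128, 8), "16 x 8-bit: accepted");
static_assert(!passesVisibleSizeCheck(64, 16), "4 x 16-bit: rejected");
static_assert(!passesVisibleSizeCheck(512, 32), "16 x 32-bit: rejected");
```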


================
Comment at: test/CodeGen/AMDGPU/extract_vector_dynelt.ll:2
+; RUN: llc -march=amdgcn -mcpu=fiji -verify-machineinstrs < %s | FileCheck -enable-var-scope -check-prefixes=GCN %s
+
+; GCN-LABEL: {{^}}float4_extelt:
----------------
The test should also cover vectors with 8-bit and 16-bit elements (and 1-bit elements, since those always break things).


https://reviews.llvm.org/D54351




