[PATCH] D46133: [InstCombine, ARM, AArch64] Convert table lookup to shuffle vector

Alexandros Lamprineas via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 27 01:56:00 PDT 2018


labrinea added inline comments.


================
Comment at: lib/Transforms/InstCombine/InstCombineCalls.cpp:938
+  unsigned NumElts = VecTy->getNumElements();
+  assert((NumElts == 8) && "Unexpected number of elements in shuffle mask!");
+
----------------
javed.absar wrote:
> Should we assert here or simply return nullptr?
I followed the same practice as other conversion functions of this file. We call simplifyTableLookup only when handling the Arm/AArch64 tbl1 intrinsics, so we certainly know that NumElts is 8. The assertion makes sure we don't call this routine from another context.


================
Comment at: lib/Transforms/InstCombine/InstCombineCalls.cpp:951
+
+      if (auto GV = dyn_cast<GlobalVariable>(V)) {
+        if (GV->isConstant() && GV->hasDefinitiveInitializer()) {
----------------
efriedma wrote:
> Please use llvm::ConstantFoldLoadFromConstPtr.
This won't work. ConstantFoldLoadFromConstPtr first tries ConstantFoldLoadThroughGEPConstantExpr. This returns just the first element of the [8xi8] array, since vld1 expects i8*, which is obtained by i8* getelementptr inbounds ([8 x i8], [8 x i8]* @big_endian_mask, i32 0, i32 0) in my reproducer.


================
Comment at: lib/Transforms/InstCombine/InstCombineCalls.cpp:974
+
+  // Check whether the mask matches the pattern { 7,6,5,4,3,2,1,0 }.
+  for (unsigned I = 0; I < NumElts; ++I) {
----------------
efriedma wrote:
> Why does the mask pattern matter?
Turning a table lookup with constant mask into a shuffle vector is not always beneficial. This particular pattern allows the back-end to generate byte reverse instructions instead of a table lookup, which is better. Generalising the transformation for every pattern probably doesn't hurt but it's not beneficial either. However, applying the transformation for larger table lookups (tbl2,tbl3,tbl4) results in worse codegen from the back-end.


================
Comment at: test/Transforms/InstCombine/AArch64/table-lookup.ll:2
+; RUN: opt -instcombine -S -o - %s | FileCheck %s
+
+target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
----------------
javed.absar wrote:
> It would be nice to add a comment here to explain the purpose of this test.
Sure, thanks.


https://reviews.llvm.org/D46133





More information about the llvm-commits mailing list