[PATCH] Implement aarch64 neon instruction class SIMD Table lookup - Clang

Wed Nov 13 16:02:40 PST 2013

  Hi Jiangning,

  I think this is correct now, so it's mostly nits. That switch feels like it could be simplified significantly, but we can probably postpone that for now.

  Cheers.

  Tim.


================
Comment at: lib/CodeGen/CGBuiltin.cpp:2465
@@ -2464,3 +2464,3 @@
 
-Value *CodeGenFunction::EmitAArch64BuiltinExpr(unsigned BuiltinID,
-                                                     const CallExpr *E) {
+static Value *packDVectorList(CodeGenFunction &CGF, ArrayRef<Value *> Ops,
+                              Value *ExtOp, Value *IndexOp, llvm::Type *ResTy,
----------------
This function now seems to be completely TBL orientated. Its name should probably reflect that.

================
Comment at: lib/CodeGen/CGBuiltin.cpp:2488-2491
@@ +2487,6 @@
+
+  // If there's an odd number of registers, the high half of the final Q is
+  // undef.
+  if (PairPos == End) {
+    Value *ZeroTbl = ConstantAggregateZero::get(TblTy);
+    TblOps.push_back(CGF.Builder.CreateShuffleVector(Ops[PairPos],
----------------
Comment doesn't match code.

================
Comment at: test/CodeGen/aarch64-neon-tbl.c:84-85
@@ +83,4 @@
+  return vtbx1_s8(a, b, c);
+  // CHECK: tbl {{v[0-9]+}}.8b, {{{v[0-9]+}}.16b}, {{v[0-9]+}}.8b
+  // CHECK: bsl {{v[0-9]+}}.8b, {{v[0-9]+}}.8b, {{v[0-9]+}}.8b
+}
----------------
I think more should be tested in the AArch32 emulation cases. It's important that the high half is 0 for vtbl and that an appropriate comparison takes place for vtbx.


http://llvm-reviews.chandlerc.com/D2070