[PATCH] D53927: [AArch64] Enable libm vectorized functions via SLEEF

Wed Feb 13 04:46:15 PST 2019

rengolin added a comment.
Herald added a subscriber: jdoerfert.

Hi,

I have a few comments inline, mostly harmless, as the change is almost entirely mechanical and makes sense.

I'm happy with the change as is, no need to split (as I also can't see how), after we address all the comments.

That'd also give some time for other people to review the new changes.

cheers,
--renato

================
Comment at: include/llvm/Analysis/TargetLibraryInfo.def:315
+/// double __gamma_r_finite(double x, int* s);
+TLI_DEFINE_ENUM_INTERNAL(tgamma_finite)
+TLI_DEFINE_STRING_INTERNAL("__gamma_r_finite")
----------------
Probably a silly question but, why "tgamma" when the function name is "gamma"?

================
Comment at: include/llvm/IR/RuntimeLibcalls.def:258

+// New Intrinsics.
+HANDLE_LIBCALL(ACOS_F32, "acosf")
----------------
Looks like someone should one day look into generating this file, an OPcode section and TargetLibraryInfo from the intrinsics table-gen... Not now, though.

================
Comment at: lib/Analysis/TargetLibraryInfo.cpp:368
   default:
-    TLI.setUnavailable(LibFunc_exp10);
-    TLI.setUnavailable(LibFunc_exp10f);
-    TLI.setUnavailable(LibFunc_exp10l);
+    if (T.getArch() != Triple::aarch64) {
+      TLI.setUnavailable(LibFunc_exp10);
----------------
This would probably fail in many non-Arm64 arches, too. 

Best to restrict to what this was meant for, which I'm guessing is x86.

================
Comment at: lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:1108
   case ISD::STRICT_FLOG2:
+  // New Intrinsics.
+  case ISD::STRICT_FACOS:
----------------
Please clean up all "new intrinsics" comments, as they're not necessary for the patch.

I imagine they were helpful while developing... :)

================
Comment at: lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:4952
+// expandExperimentalIntrinsics - return the canonical non-vector libm
+// function name for the experimental Intrinsics.
+static const char*
----------------
I think we can have these in the same place as the others, but with an "experimental" comment before.

The main idea behind the experimental tag is so that people don't rely on it until it's complete, but once it is, they can.

So if we create them in different places, we'd have to move them in once they're no longer experimental and that would be a pain.

I'm not expecting this to be in experimental for too long and am hoping to get this official by the next release (july-ish 2019), so, let's keep the experimental status on comments only and then remove the comments when everything is working as expected.

================
Comment at: test/Transforms/LoopVectorize/AArch64/sleef-calls-aarch64-fast.ll:1
+; Do NOT use -O3. It will lower exp2 to ldexp, and the test will fail.
+; RUN: opt -vector-library=SLEEF -loop-unroll -loop-vectorize -S < %s | FileCheck %s
----------------
We don't generally recommend O3 in tests anyway, as this introduces test noise when the passes in O3 change (add/del/move around).

================
Comment at: test/Transforms/LoopVectorize/AArch64/sleef-calls-aarch64-fast.ll:3
+; RUN: opt -vector-library=SLEEF -loop-unroll -loop-vectorize -S < %s | FileCheck %s
+; RUN: opt -vector-library=SLEEF -loop-unroll -loop-vectorize -fp-contract=fast -S < %s | FileCheck %s
+
----------------
does fp-contract=fast really make a difference for these transformations?

================
Comment at: test/Transforms/LoopVectorize/AArch64/sleef-calls-aarch64-fast.ll:6
+target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64-unknown-linux-gnu"
+
----------------
is this really specific to Arm64? At least the generation could be done for any target, if the library supports it or not is not relevant for this test, no?

================
Comment at: test/Transforms/LoopVectorize/AArch64/sleef-calls-aarch64.ll:3
+; RUN: opt -vector-library=SLEEF -loop-unroll -loop-vectorize -S < %s | FileCheck %s
+; RUN: opt -vector-library=SLEEF -loop-unroll -loop-vectorize -fp-contract=fast -S < %s | FileCheck %s
+
----------------
What's the difference between these two test files?

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D53927/new/

https://reviews.llvm.org/D53927