[PATCH] D95539: [LV] Add analysis remark for mixed precision conversions

Tue Feb 16 13:27:30 PST 2021

fhahn added a comment.

Thanks for the update. I left a few comments with respect to the test case. I think it would be good to add a few more tests to cover all cases in the worklist loop, including using instructions defined outside the loop in the fpext and multiple stores that share a fpext.

================
Comment at: llvm/test/Transforms/LoopVectorize/mixed-precision-remarks.ll:2
+; RUN: opt -loop-vectorize -pass-remarks-analysis=loop-vectorize -disable-output < %s 2>&1 | FileCheck %s
+; ModuleID = 'mixed-precision.c'
+source_filename = "mixed-precision.c"
----------------
not needed

================
Comment at: llvm/test/Transforms/LoopVectorize/mixed-precision-remarks.ll:3
+; ModuleID = 'mixed-precision.c'
+source_filename = "mixed-precision.c"
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
----------------
not needed

================
Comment at: llvm/test/Transforms/LoopVectorize/mixed-precision-remarks.ll:5
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
----------------
If this requires the x86_64 triple, it should go into the `X86` subdirectory.

================
Comment at: llvm/test/Transforms/LoopVectorize/mixed-precision-remarks.ll:7
+
+; void f(float * restrict X, unsigned long N) {
+;   __builtin_assume(N > 32 && N % 32 == 0);
----------------
Personally I think the IR in the test case is concise and the C version should not be needed.

================
Comment at: llvm/test/Transforms/LoopVectorize/mixed-precision-remarks.ll:14
+; CHECK: remark: mixed-precision.c:4:26: floating point conversion changes vector width. Mixed floating point precision requires an up/down cast that will negatively impact performance.
+define dso_local void @f(float* noalias nocapture %X, i64 %N) local_unnamed_addr #0 !dbg !6 {
+entry:
----------------
`dso_local` and `local_unnamed_addr` are not needed. Is `#0` needed?

================
Comment at: llvm/test/Transforms/LoopVectorize/mixed-precision-remarks.ll:16
+entry:
+  %cmp = icmp ugt i64 %N, 32, !dbg !8
+  %rem = and i64 %N, 31, !dbg !9
----------------
none of the instructions in the block should be needed?

================
Comment at: llvm/test/Transforms/LoopVectorize/mixed-precision-remarks.ll:33
+  %conv3 = fptrunc double %mul to float, !dbg !19
+  store float %conv3, float* %arrayidx, align 4, !dbg !20, !tbaa !14
+  %inc = add nuw i64 %i.013, 1, !dbg !21
----------------
`!tbaa` should not be needed here and at other places.

================
Comment at: llvm/test/Transforms/LoopVectorize/mixed-precision-remarks.ll:36
+  %exitcond.not = icmp eq i64 %inc, %N, !dbg !22
+  br i1 %exitcond.not, label %for.cond.cleanup, label %for.body, !dbg !11, !llvm.loop !23
+}
----------------
`!llvm.loop` should not be needed.

================
Comment at: llvm/test/Transforms/LoopVectorize/mixed-precision-remarks.ll:49
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 12.0.0", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, enums: !2, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "mixed-precision.c", directory: "/tmp/mixed-precision.c")
----------------
Could you try to prune the metadata a bit? I think we would only need a location for the instruction we generate a remark for.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95539/new/

https://reviews.llvm.org/D95539