[PATCH] D22792: VecClone Pass

Tue Mar 20 06:19:43 PDT 2018

fpetrogalli added a comment.

Hi Matt - overall the patch looks good, I just have a couple of comments.

I think you should remove some of the testing you do on the vectorizer side of things in the vectorizer patch, not here. Here you should be testing only the cloning of the function.

Moreover, I would like to see more testing happening in the following cases:

1. when there is only //one// vector variant listed in the `vector-variants` attribute
2. when the vector variant listed in the attribute is not supported by the target and the cloning does not happen (say you are compiling for SSE but you only have AVX512 vector functions listed in the attribute).

Moreover, I have added one comment about using IR vector types instead of integers to maps ISAs to vector register sizes. That would probably make extending this pass to SVE easier because for vector length agnostic types we will need to do some symbolic computation on vector sizes.

Thank you,

Francesco

================
Comment at: include/llvm/Analysis/TargetTransformInfo.h:898
+  /// \returns The ISA class as a string.
+  std::string isaClassToString(int ISAClass) const;
+
----------------
Could you replace information about register size to use IR vector types? For example, you could use <16 x i8> for 128-bit wide registers, or <64 x i8> for 512-bit registers. You could then base all the computations of the VLEN of a vector function on that. I am asking because that will make easier the handling of Vector Length Agnostic (VLA) vector functions that we will end up using for targets like the AArch64 Scalable Vector Extension (SVE).

================
Comment at: test/Transforms/LoopVectorize/masked_simd_func.ll:1
+; Note: Test the simd function caller side functionality. The function side vectorization is tested under VecClone.
+
----------------
Any test that checks the caller side functionality should be be under the patch that enables the loop vectorizer to use the VecClone pass.

================
Comment at: test/Transforms/LoopVectorize/simd_func.ll:83
+
+attributes #0 = { noinline nounwind uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="core-avx2" "target-features"="+aes,+avx,+avx2,+bmi,+bmi2,+cx16,+f16c,+fma,+fsgsbase,+fxsr,+lzcnt,+mmx,+movbe,+pclmul,+popcnt,+rdrnd,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave,+xsaveopt" "unsafe-fp-math"="false" "use-soft-float"="false" "vector-variants"="_ZGVbN4vlu_dowork,_ZGVcN8vlu_dowork,_ZGVdN8vlu_dowork,_ZGVeN16vlu_dowork,_ZGVbM4vlu_dowork,_ZGVcM8vlu_dowork,_ZGVdM8vlu_dowork,_ZGVeM16vlu_dowork" }
+attributes #1 = { noinline nounwind uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="core-avx2" "target-features"="+aes,+avx,+avx2,+bmi,+bmi2,+cx16,+f16c,+fma,+fsgsbase,+fxsr,+lzcnt,+mmx,+movbe,+pclmul,+popcnt,+rdrnd,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave,+xsaveopt" "unsafe-fp-math"="false" "use-soft-float"="false" }
----------------
I think we should unit test each variant, not a single test that picks up one vector function from a list of "vector-variants".

https://reviews.llvm.org/D22792