[llvm-dev] Auto-vectorization option

Fri May 14 08:15:14 PDT 2021

On 5/14/2021 15:20, Sudakshina Dutta wrote:
> Dear Matt P. Dziubinski,
> 
> Thanks a lot for your reply. Although the vectorization is clearly
> visible in Godbolt, I could not reproduce it from the command line. Does
> it require a specific version of LLVM/Clang?

No, only the -debug options require a build with assertions enabled;
everything else should work with a regular release of Clang/LLVM. Chances
are you will have to debug it (for example, a missing side effect may
cause the entire loop to be optimized away).
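
To illustrate the side-effect point, here is a minimal sketch. It is not
the find_max.c from this thread (that source is not reproduced here); the
names find_max and find_max_unused are hypothetical. The first function
returns the reduction result, so the loop stays live and remains a
candidate for the loop vectorizer. The second discards the result, so the
optimizer is free to delete the loop before vectorization is attempted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the thread's find_max.c (assumption).
 * The maximum is returned, so the loop has an observable result and
 * cannot be removed as dead code. */
int find_max(const int *a, size_t n)
{
    assert(n > 0);              /* a[0] is read below, so n must be >= 1 */
    int max = a[0];
    for (size_t i = 1; i < n; ++i) {
        if (a[i] > max)
            max = a[i];
    }
    return max;
}

/* Same loop, but the result is never used. With optimizations enabled
 * the compiler may delete the whole loop, leaving nothing to vectorize. */
void find_max_unused(const int *a, size_t n)
{
    assert(n > 0);
    int max = a[0];
    for (size_t i = 1; i < n; ++i) {
        if (a[i] > max)
            max = a[i];
    }
    (void)max;                  /* result discarded on purpose */
}

/* One way to inspect the IR, using only flags mentioned in this thread:
 *   clang -O3 -S -emit-llvm -fno-discard-value-names -g0 \
 *         -Rpass=vector -Rpass-analysis=vector find_max.c -o find_max.ll
 */
```

If the loop was vectorized, the .ll output for find_max would typically
contain vector types such as <4 x i32>, and the -Rpass remarks should
mention the loop; the exact output depends on the target and the
Clang/LLVM version.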

Best,
Matt

> 
> Regards
> Sudakshina
> 
> On Fri, May 14, 2021 at 5:53 PM Matt P. Dziubinski <matdzb at gmail.com>
> wrote:
> 
>     On 5/14/2021 13:30, Sudakshina Dutta via llvm-dev wrote:
>      > Dear all,
>      >
>      > Thanks to all of you. I have executed the following commands on
>     the code
>      > given above.
>      >
>      > clang -O3 -S -c find_max.c -Rpass=vector -Rpass-analysis=vector -o
>      > find_max.ll
>      >
>      > However, the generated output is assembly code (attached). Is
>      > there any way to generate a vectorized IR (.ll) file?
> 
>     To get LLVM IR from the frontend (Clang) use -emit-llvm
>     -fno-discard-value-names, e.g., https://godbolt.org/z/aWz37qYdW
> 
>     If you don't need debugger intrinsics (llvm.dbg.*) add -g0, e.g.,
>     https://godbolt.org/z/aWz37qYdW
> 
>     As Sjoerd has mentioned, passing -mllvm -print-before-all to Clang is
>     useful to get the pre-vectorized LLVM IR (as well as to observe the
>     effects of consecutive transformations). Example:
>     https://godbolt.org/z/4za6h6fqo
> 
>     You can then extract the unoptimized LLVM IR and play with it in "opt"
>     (the middle-end optimizer tool) to get the LLVM IR optimized by the
>     middle-end passes (including loop vectorizer); note that now you can
>     just pass -print-before-all directly:
>     https://llvm.godbolt.org/z/P7E3PGE61
> 
>     In particular, the LLVM IR displayed under "*** IR Dump Before
>     LoopVectorizePass on _Z1fPim ***" is a good baseline for comparisons.
> 
>     Add "-mllvm -print-module-scope" to get the LLVM IR for the full module
>     (translation unit): https://godbolt.org/z/Go7zK8vsW
> 
>     Then, pass this LLVM IR (right before LoopVectorizePass) to "opt"
>     using options "-loop-vectorize -debug-only=loop-vectorize" to
>     observe the loop vectorization pass in action:
>     https://llvm.godbolt.org/z/WMa1qosoq
> 
>     Note that you need a binary built with assertions enabled to use -debug
>     options.
> 
>     Last but not least you can give the optimized LLVM IR to "llc" (the
>     backend tool) to get the final assembly:
>     https://llvm.godbolt.org/z/hxevcqKEG
> 
>     Best,
>     Matt
>