<div dir="ltr"><div>Dear Matt P. Dziubinski,</div><div><br></div><div>Thanks a lot for your reply. Although the vectorization is clearly visible on godbolt, I could not reproduce it from the command line. Does it require a specific version of LLVM/Clang?</div><div><br></div><div>Regards</div><div>Sudakshina<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, May 14, 2021 at 5:53 PM Matt P. Dziubinski &lt;<a href="mailto:matdzb@gmail.com">matdzb@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 5/14/2021 13:30, Sudakshina Dutta via llvm-dev wrote:<br>

> Dear all,<br>

> <br>

> Thanks to all of you. I have executed the following commands on the code <br>

> given above.<br>

> <br>

> clang -O3 -S -c find_max.c -Rpass=vector -Rpass-analysis=vector -o find_max.ll<br>

> <br>

> However, the generated output is assembly code (attached). Is there any <br>

> way to generate a vectorized IR (.ll) file?<br>

<br>

To get LLVM IR from the frontend (Clang), use -S -emit-llvm; adding <br>

-fno-discard-value-names keeps readable value names, e.g., <a href="https://godbolt.org/z/aWz37qYdW" rel="noreferrer" target="_blank">https://godbolt.org/z/aWz37qYdW</a><br>

<br>

If you don't need the debug intrinsics (llvm.dbg.*), add -g0, e.g., <br>

<a href="https://godbolt.org/z/aWz37qYdW" rel="noreferrer" target="_blank">https://godbolt.org/z/aWz37qYdW</a><br>
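<br>
A minimal sketch of the corresponding command line, assuming clang is on PATH. <br>
The body of find_max.c below is a hypothetical stand-in (the original source is <br>
not part of this message); the output name find_max.ll follows the thread.<br>

```shell
SRC=find_max.c   # file name used in the thread
# If the file is absent, write a tiny hypothetical stand-in so the sketch runs.
[ -f "$SRC" ] || printf '%s\n' \
  'int find_max(const int *a, unsigned long n) {' \
  '  int m = a[0];' \
  '  for (unsigned long i = 1; i < n; ++i) if (a[i] > m) m = a[i];' \
  '  return m;' \
  '}' > "$SRC"

if command -v clang >/dev/null 2>&1; then
    # -S -emit-llvm writes textual LLVM IR instead of target assembly;
    # -fno-discard-value-names keeps value names; -g0 omits debug info.
    clang -O3 -S -emit-llvm -fno-discard-value-names -g0 \
        -Rpass=vector -Rpass-analysis=vector \
        "$SRC" -o find_max.ll
else
    echo "clang not found on PATH; skipping" >&2
fi
```

Without -emit-llvm, -S alone asks Clang for assembly, which is why the earlier command produced a .s-style file despite the .ll name.<br>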

<br>

As Sjoerd has mentioned, passing -mllvm -print-before-all to Clang is <br>

useful to get pre-vectorization LLVM IR (as well as observe the effects of <br>

consecutive transformations); Example: <a href="https://godbolt.org/z/4za6h6fqo" rel="noreferrer" target="_blank">https://godbolt.org/z/4za6h6fqo</a><br>

<br>

You can then extract the unoptimized LLVM IR and play with it in "opt" <br>

(the middle-end optimizer tool) to get the LLVM IR optimized by the <br>

middle-end passes (including the loop vectorizer); note that opt accepts <br>

-print-before-all directly, without the -mllvm prefix: <a href="https://llvm.godbolt.org/z/P7E3PGE61" rel="noreferrer" target="_blank">https://llvm.godbolt.org/z/P7E3PGE61</a><br>

<br>

In particular, the LLVM IR displayed under "*** IR Dump Before <br>

LoopVectorizePass on _Z1fPim ***" is a good baseline for comparisons.<br>

<br>

Add "-mllvm -print-module-scope" to get the LLVM IR for the full module <br>

(translation unit): <a href="https://godbolt.org/z/Go7zK8vsW" rel="noreferrer" target="_blank">https://godbolt.org/z/Go7zK8vsW</a><br>
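<br>
Outside godbolt, the same dump can be captured roughly as follows. This is a <br>
sketch under assumptions: clang is on PATH, ir_dump.txt is an arbitrary output <br>
name, and the find_max.c body is a hypothetical stand-in written only if the <br>
file is missing.<br>

```shell
SRC=find_max.c   # file name used in the thread
# If the file is absent, write a tiny hypothetical stand-in so the sketch runs.
[ -f "$SRC" ] || printf '%s\n' \
  'int find_max(const int *a, unsigned long n) {' \
  '  int m = a[0];' \
  '  for (unsigned long i = 1; i < n; ++i) if (a[i] > m) m = a[i];' \
  '  return m;' \
  '}' > "$SRC"

if command -v clang >/dev/null 2>&1; then
    # The IR dumps are written to stderr, so redirect it to a file;
    # the object file itself is discarded.
    clang -O3 -c "$SRC" -o /dev/null \
        -mllvm -print-before-all -mllvm -print-module-scope 2> ir_dump.txt
    # List the pass banners that mention a vectorizer, with line numbers,
    # to locate the module dumped right before loop vectorization.
    grep -n 'IR Dump Before' ir_dump.txt | grep -i 'vectoriz'
else
    echo "clang not found on PATH; skipping" >&2
fi
```

The exact banner text depends on the LLVM version and pass manager, so the grep pattern is deliberately loose.<br>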

<br>

Then, pass this LLVM IR (the dump taken right before LoopVectorizePass) to "opt" using <br>

options "-loop-vectorize -debug-only=loop-vectorize" to observe the loop <br>

vectorization pass in action:<br>

<a href="https://llvm.godbolt.org/z/WMa1qosoq" rel="noreferrer" target="_blank">https://llvm.godbolt.org/z/WMa1qosoq</a><br>

<br>

Note that you need a binary built with assertions enabled to use -debug <br>

options.<br>

<br>

Last but not least, you can give the optimized LLVM IR to "llc" (the <br>

backend tool) to get the final assembly: <br>

<a href="https://llvm.godbolt.org/z/hxevcqKEG" rel="noreferrer" target="_blank">https://llvm.godbolt.org/z/hxevcqKEG</a><br>
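<br>
The last two steps (opt, then llc) can be sketched locally as below. The file <br>
names before_vectorize.ll, vectorized.ll and vectorized.s are hypothetical; the <br>
first is assumed to hold the module saved by hand from the dump taken right <br>
before LoopVectorizePass. The sketch uses the new pass manager spelling <br>
-passes=loop-vectorize; older releases accept the legacy -loop-vectorize form <br>
quoted in this thread.<br>

```shell
# Assumption: hand-saved module dumped right before LoopVectorizePass.
IN=before_vectorize.ll

if [ -f "$IN" ] && command -v opt >/dev/null 2>&1 \
                && command -v llc >/dev/null 2>&1; then
    # Run only the loop vectorizer and write textual IR (-S).
    # With an assertions-enabled build, -debug-only=loop-vectorize can be
    # appended to trace the vectorizer's decisions.
    opt -passes=loop-vectorize -S "$IN" -o vectorized.ll
    # Lower the optimized IR to target assembly with the backend tool.
    llc vectorized.ll -o vectorized.s
else
    echo "skipping: need $IN plus opt and llc on PATH" >&2
fi
```

Running a single pass in isolation like this makes it easy to diff the IR before and after vectorization.<br>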

<br>

Best,<br>

Matt<br>

</blockquote></div>