<div dir="ltr"><div><div><div>Hi,<br><br></div>I tried the loop vectorizer and the SLP vectorizer (LLVM 3.3) on the code below, which assigns 5 to each element of the array "%b":<br><br><br>; ModuleID = 'res.ll'<br>
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"<br>target triple = "x86_64-unknown-linux-gnu"<br>
<br>; Function Attrs: nounwind<br>define void @loop_ptr_19121568([10 x i32*]* nocapture %params_vec) #0 {<br>entry:<br> %0 = bitcast [10 x i32*]* %params_vec to double**<br> %temp_1 = load double** %0, align 8<br> %1 = getelementptr [10 x i32*]* %params_vec, i64 0, i64 1<br>
%2 = load i32** %1, align 8<br> %temp_2 = bitcast i32* %2 to double*<br> %3 = getelementptr [10 x i32*]* %params_vec, i64 0, i64 3<br> %4 = load i32** %3, align 8<br> %b = bitcast i32* %4 to double*<br> %5 = getelementptr [10 x i32*]* %params_vec, i64 0, i64 4<br>
%6 = load i32** %5, align 8<br> %d = bitcast i32* %6 to double*<br> %7 = getelementptr [10 x i32*]* %params_vec, i64 0, i64 6<br> %8 = load i32** %7, align 8<br> %temp_0 = bitcast i32* %8 to double*<br> %9 = getelementptr [10 x i32*]* %params_vec, i64 0, i64 7<br>
%10 = load i32** %9, align 8<br> %temp_4 = bitcast i32* %10 to i1*<br> %11 = getelementptr [10 x i32*]* %params_vec, i64 0, i64 8<br> %12 = load i32** %11, align 8<br> %temp_3 = bitcast i32* %12 to double*<br> %13 = getelementptr [10 x i32*]* %params_vec, i64 0, i64 9<br>
%14 = load i32** %13, align 8<br> %i = bitcast i32* %14 to double*<br> store double 1.000000e+00, double* %temp_0, align 8<br> %15 = load double* %temp_1, align 8<br> store double %15, double* %temp_2, align 8<br> %16 = load double* %d, align 8<br>
%17 = fmul double %16, %16<br> store double %17, double* %temp_3, align 8<br> %.pre = load double* %temp_0, align 8<br> %cmp_le1 = fcmp ole double %.pre, %17<br> store i1 %cmp_le1, i1* %temp_4, align 1<br> br i1 %cmp_le1, label %"i = temp_0", label %end_fun<br>
<br>"i = temp_0": ; preds = %entry, %"i = temp_0"<br> %18 = load double* %temp_0, align 8<br> store double %18, double* %i, align 8<br> %19 = fptoui double %18 to i32<br>
%20 = add i32 %19, -1<br> %21 = sext i32 %20 to i64<br> %22 = getelementptr double* %b, i64 %21<br> store double 5.000000e+00, double* %22, align 8<br> %23 = load double* %temp_0, align 8<br> %24 = load double* %temp_2, align 8<br>
%25 = fadd double %23, %24<br> store double %25, double* %temp_0, align 8<br> %.pre1 = load double* %temp_3, align 8<br> %cmp_le = fcmp ole double %25, %.pre1<br> store i1 %cmp_le, i1* %temp_4, align 1<br> br i1 %cmp_le, label %"i = temp_0", label %end_fun<br>
<br>end_fun: ; preds = %"i = temp_0", %entry<br> ret void<br>}<br><br>attributes #0 = { nounwind }<br><br>-------------------------------------------------------------------------------------------------------<br>
The loop vectorizer finds the loop, but I don't understand why it cannot compute the loop exit count:<br><br>LV: Checking a loop in "loop_ptr_19121568"<br>LV: Found a loop: i = temp_0<br>LV: SCEV could not compute the loop exit count.<br>
LV: Not vectorizing.<br><br></div><div>The SLP vectorizer debug output:<br><br></div>SLP: Vectorizing a list of length = 2.<br>SLP: Cost of pair:1 Cost of extract:1.<br>SLP: Vectorizing a list of length = 2.<br>SLP: Cost of pair:1 Cost of extract:1.<br>
SLP: Found 4 stores to vectorize.<br>SLP: Vectorizing a list of length = 2.<br>SLP: Cost of pair:1 Cost of extract:1.<br>SLP: Vectorizing a list of length = 2.<br>SLP: Cost of pair:1 Cost of extract:1.<br>SLP: Found 4 stores to vectorize.<br>
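<br>A possible reading of the "SCEV could not compute the loop exit count" message, offered as an assumption rather than something confirmed in this thread: the loop counter lives in memory as a double (%temp_0), is reloaded on every iteration, and is tested with fcmp, whereas scalar evolution reasons about integer and pointer expressions. For comparison, here is a minimal C sketch of the same store pattern with an integer induction variable and a bound fixed before the loop. The name fill_five, the parameter n, and the step of 1 are hypothetical and do not come from the IR above.<br><br>

```c
#include <stddef.h>

/* Hypothetical helper (not part of the original IR): stores 5.0 into
 * b[0] .. b[n-1]. The counter i is an integer kept in a local variable,
 * and the bound n does not change inside the loop. */
void fill_five(double *b, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        b[i] = 5.0;
    }
}
```

<br>Compiling a function of this shape with clang -O3 would be one way to check whether the trip count becomes computable; whether the loop is actually vectorized still depends on the target and the LLVM version.<br>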
<br><br></div>Thanks,<br>Matthieu<br><div><div><br><br></div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Oct 16, 2013 at 8:59 PM, Nadav Rotem <span dir="ltr"><<a href="mailto:nrotem@apple.com" target="_blank">nrotem@apple.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word">Both the SLP vectorizer and the Loop vectorizer support vectorizing pointers. The attached code looks like a candidate for the SLP-vectorizer. Can you run the SLP-vectorizer with the flag -mllvm -debug-only=SLP and attach the log? I think that we are missing the pattern for the roots of the tree. <div>
<br></div><div>Thanks,</div><div>Nadav<div><div class="h5"><br><div><br><div><div>On Oct 16, 2013, at 5:28 PM, Tom Stellard <<a href="mailto:tom@stellard.net" target="_blank">tom@stellard.net</a>> wrote:</div><br><blockquote type="cite">
<div style="font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">On Wed, Oct 16, 2013 at 11:14:06AM -0400, Matthieu Dubet wrote:<br>
<blockquote type="cite">Hi,<br><br>Thank you for the information.<br><br>I'm now keeping the array as a pointer (i32*), but the vectorizer doesn't<br>vectorize it.<br><br>I've pasted the function code before and after optimization (and the list<br>
of optimizations that I have enabled) in this Gist:<br><a href="https://gist.github.com/maattd/7008683" target="_blank">https://gist.github.com/maattd/7008683</a><br><br>Some "weird" facts about my LLVM code:<br>
<br>* all variables (even the one used for the loop condition) are pointers to<br>memory allocated from the C world and passed to the LLVM functions as<br>arguments<br>* even with "opt->add(new llvm::DataLayout(*ee->getDataLayout())) ;" in the<br>
code, module->dump() outputs neither the data layout nor the target<br>triple<br><br>Could either of these points confuse the vectorizer?<br><br><br>On Fri, Oct 11, 2013 at 1:40 PM, Renato Golin <<a href="mailto:renato.golin@linaro.org" target="_blank">renato.golin@linaro.org</a>> wrote:<br>
<br><blockquote type="cite">On 11 October 2013 18:27, Matthieu Dubet <<a href="mailto:maattdd@gmail.com" target="_blank">maattdd@gmail.com</a>> wrote:<br><br><blockquote type="cite">How can I tell LLVM to consider this i32* as an <10 x i32> (and thus get<br>
the performance improvements thanks to SIMD ..etc..) ?<br><br></blockquote><br>Hi Matthieu,<br><br>You shouldn't need to do anything, the vectorizer should spot that for<br>you, if the machine you're compiling to has support for vector<br>
instructions. Any kind of vector operations that you may want to hard-code<br>will make it not work on anything other than the intrinsics/inline asm<br>you're using, which is not a good idea.<br><br></blockquote></blockquote>
<br>Which part of the vectorizer is responsible for doing pointer->vector transformations?<br><br>-Tom<br><br><blockquote type="cite"><blockquote type="cite">If your code didn't get vectorized, it's possible that the way the pointer<br>
is being iterated is not clear enough for the vectorizer to spot, so maybe<br>you need to make it clearer, and that depends<br>on the code in question. If you could share the code (or a similar example)<br>
with the list, people could help you spot the pattern and make it vectorize.<br><br>cheers,<br>--renato<br></blockquote></blockquote><br><br><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><br>
<blockquote type="cite">_______________________________________________<br>LLVM Developers mailing list<br><a href="mailto:LLVMdev@cs.uiuc.edu" target="_blank">LLVMdev@cs.uiuc.edu</a> <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br></blockquote><br></div>
</blockquote></div><br></div></div></div></div></div></blockquote></div><br></div>