<div dir="ltr"><div>Hello Michael,</div><div>Very sorry for the late reply; we had exams and assignments this week, and I had to read up on __builtin_assume_aligned since I had not come across it before.<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>#pragma clang loop vectorize_assume_alignment(32)<br> for (int i = 0; i < n; i++) {<br> a[i] = b[i] + i*i;<br> }</div></blockquote><div>With this pragma, every memory access inside the loop is marked with an alignment of 32 bytes (LLVM IR alignments are in bytes, not bits).<br></div><div>Example IR:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>for.cond: ; preds = %for.inc, %entry<br> %5 = load i32, i32* %i, align 32, !llvm.access.group !2<br> %6 = load i32, i32* %n, align 32, !llvm.access.group !2<br> %cmp = icmp slt i32 %5, %6<br> br i1 %cmp, label %for.body, label %for.end<br><br>for.body: ; preds = %for.cond<br> %7 = load i32, i32* %i, align 32, !llvm.access.group !2<br> %8 = load i32, i32* %i, align 32, !llvm.access.group !2<br> %idxprom = sext i32 %8 to i64<br> %arrayidx = getelementptr inbounds i32, i32* %vla1, i64 %idxprom<br> store i32 %7, i32* %arrayidx, align 32, !llvm.access.group !2<br> br label %for.inc<br><br>for.inc: ; preds = %for.body<br> %9 = load i32, i32* %i, align 32, !llvm.access.group !2<br> %inc = add nsw i32 %9, 1<br> store i32 %inc, i32* %i, align 32, !llvm.access.group !2<br> br label %for.cond, !llvm.loop !3<br></div></blockquote><div>With the pragma, you do not need to create a separate pointer for every array (or operand) that the loop operates on, as you do with __builtin_assume_aligned: 
<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><span style="font-size:16.6043px;font-family:monospace">void mult(float* x, int size, float factor)</span><span style="font-size:16.6043px;font-family:monospace">{</span><span style="font-size:16.6043px;font-family:monospace"><br></span></div><div><span style="font-size:16.6043px;font-family:monospace"> float* ax = (float*)__builtin_assume_aligned(x, 64);</span><span style="font-size:16.6043px;font-family:monospace"><br></span></div><div><span style="font-size:16.6043px;font-family:monospace"> for (int i = 0; i < size; ++i)</span></div><div><span style="font-size:16.6043px;font-family:monospace"> </span><span style="font-size:16.6043px;font-family:monospace">ax[i] *= factor;</span><span style="font-size:16.6043px;font-family:monospace"><br></span></div><div><span style="font-size:16.6043px;font-family:monospace">}</span></div></blockquote><div>The IR generated for this:</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div> define void @mult(i32*, i32, float) #0 {<br> %4 = alloca i32*, align 8<br> %5 = alloca i32, align 4<br> %6 = alloca float, align 4<br> %7 = alloca i32*, align 8<br> %8 = alloca i32, align 4<br> store i32* %0, i32** %4, align 8<br> store i32 %1, i32* %5, align 4<br> store float %2, float* %6, align 4<br> %9 = load i32*, i32** %4, align 8<br> %10 = bitcast i32* %9 to i8*<br> %11 = ptrtoint i8* %10 to i64<br> %12 = and i64 %11, 63<br> %13 = icmp eq i64 %12, 0<br> call void @llvm.assume(i1 %13)<br> %14 = bitcast i8* %10 to i32*<br> store i32* %14, i32** %7, align 8<br> store i32 0, i32* %8, align 4<br> br label %15<br><br>; &lt;label&gt;:15: ; preds = %29, %3<br> %16 = load i32, i32* %8, align 4<br> %17 = load i32, i32* %5, align 4<br> %18 = icmp slt i32 %16, %17<br> br i1 %18, label %19, label %32<br><br>; &lt;label&gt;:19: ; preds = %15<br> %20 = 
load float, float* %6, align 4<br> %21 = load i32*, i32** %7, align 8<br> %22 = load i32, i32* %8, align 4<br> %23 = sext i32 %22 to i64<br> %24 = getelementptr inbounds i32, i32* %21, i64 %23<br> %25 = load i32, i32* %24, align 4<br> %26 = sitofp i32 %25 to float<br> %27 = fmul float %26, %20<br> %28 = fptosi float %27 to i32<br> store i32 %28, i32* %24, align 4<br> br label %29<br><br>; &lt;label&gt;:29: ; preds = %19<br> %30 = load i32, i32* %8, align 4<br> %31 = add nsw i32 %30, 1<br> store i32 %31, i32* %8, align 4<br> br label %15<br><br>; &lt;label&gt;:32: ; preds = %15<br> ret void<br>}<br></div></blockquote><div>Here the alignment is only assumed (through the llvm.assume call), and the loads and stores keep their default alignment (align 4), whereas with the #pragma each access in the loop is given the specified alignment directly.<br></div><div><span style="font-size:16.6043px;font-family:monospace"><font size="2"><font face="arial,sans-serif">The pragma is easier to use, and it would be helpful to have one, since OpenMP and the Intel compilers already provide similar functionality.<br></font></font></span></div><div><span style="font-size:16.6043px;font-family:monospace"><font size="2"><font face="arial,sans-serif">Thank you. If I have made any mistakes, please let me know.</font></font></span></div><div><span style="font-size:16.6043px;font-family:monospace"><font size="2"><font face="arial,sans-serif"><br></font></font></span></div><div><span style="font-size:16.6043px;font-family:monospace"><font size="2"><font face="arial,sans-serif">Happy Mahto</font></font></span></div><div><span style="font-size:16.6043px;font-family:monospace"><font size="2"><font face="arial,sans-serif">CSE Undergrad, IIT Hyderabad</font></font></span></div><div><span style="font-size:16.6043px;font-family:monospace"><br></span></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Nov 14, 2019 at 10:32 PM Michael Kruse via Phabricator <<a href="mailto:reviews@reviews.llvm.org">reviews@reviews.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid 
rgb(204,204,204);padding-left:1ex">Meinersbur added a comment.<br>
<br>
Could you elaborate why this is better than `__builtin_assume_aligned`?<br>
<br>
<br>
Repository:<br>
rG LLVM Github Monorepo<br>
<br>
CHANGES SINCE LAST ACTION<br>
<a href="https://reviews.llvm.org/D69897/new/" rel="noreferrer" target="_blank">https://reviews.llvm.org/D69897/new/</a><br>
<br>
<a href="https://reviews.llvm.org/D69897" rel="noreferrer" target="_blank">https://reviews.llvm.org/D69897</a><br>
<br>
<br>
<br>
</blockquote></div>