<div dir="ltr">Thank you.<div><br></div><div>If I run the same vector-sum code with a much larger number of iterations, such as 100000000000, will the nontemporal loads and stores be effective?</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jan 22, 2018 at 1:59 AM, Hal Finkel <span dir="ltr"><<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><span class="">
<p><br>
</p>
<div class="m_3078348663970828176moz-cite-prefix">On 01/20/2018 12:29 PM, hameeza ahmed
via llvm-dev wrote:<br>
</div>
<blockquote type="cite">
        <div dir="ltr">I have already seen the usage of <span style="font-size:12.8px">__builtin_nontemporal_store, but I
            want to automate the identification of nontemporal loads and stores.
            I think I need to write a pass. Is it possible to detect
            nontemporal loops without Polly? <br>
</span></div>
</blockquote>
<br></span>
Yes, but we don't have anything that does that right now. The cost
modeling is non-trivial, however. In the loop below, which of those
    accesses would you expect to be nontemporal? Each of those arrays
    spans only 8 KB (24 KB across all three), and that's certainly smaller than many L1 caches.
Turning those into nontemporal accesses could certainly lead to a
performance regression for that loop, subsequent code, or both. If
we do this more generally, I suspect that we'd need to split the
loop so that small trip counts don't use them at all, and for larger
trip counts, we don't disturb data-reuse opportunities that would
otherwise exist.<br>
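    <br>
    A minimal sketch of that kind of split, written by hand in C rather
    than as a compiler pass. The names should_use_nontemporal, vector_sum,
    NT_LOAD, NT_STORE, and the 32 KB ASSUMED_CACHE_BYTES threshold are all
    hypothetical and chosen only for illustration; the builtins are
    Clang's, and the fallback macros exist only so the code still compiles
    with other compilers. The size check is not a real cost model: it
    ignores whether later code reuses the data.<br>
    <br>
```c
#include <stddef.h>
#include <stdint.h>

/* Use Clang's nontemporal builtins when available; otherwise fall back to
   ordinary loads and stores so the sketch still compiles elsewhere. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_nontemporal_load) && \
      __has_builtin(__builtin_nontemporal_store)
#    define HAVE_NT_BUILTINS 1
#  endif
#endif

#ifdef HAVE_NT_BUILTINS
#  define NT_LOAD(p)     __builtin_nontemporal_load(p)
#  define NT_STORE(v, p) __builtin_nontemporal_store((v), (p))
#else
#  define NT_LOAD(p)     (*(p))
#  define NT_STORE(v, p) (*(p) = (v))
#endif

/* Assumption: a placeholder cache capacity, not a measured value. */
#define ASSUMED_CACHE_BYTES (32u * 1024u)

/* Returns 1 when the three n-element int32_t arrays together exceed the
   assumed cache capacity. Written as a division to avoid overflow for
   very large n. */
static int should_use_nontemporal(size_t n)
{
    return n > ASSUMED_CACHE_BYTES / (3 * sizeof(int32_t));
}

/* a[i] = b[i] + c[i], with the loop versioned on the trip count. */
static void vector_sum(int32_t *a, const int32_t *b, const int32_t *c,
                       size_t n)
{
    if (should_use_nontemporal(n)) {
        /* Large trip count: stream the data past the cache. */
        for (size_t i = 0; i < n; i++)
            NT_STORE(NT_LOAD(b + i) + NT_LOAD(c + i), a + i);
    } else {
        /* Small trip count: plain accesses keep the data cached. */
        for (size_t i = 0; i < n; i++)
            a[i] = b[i] + c[i];
    }
}
```
    <br>
    With these assumed numbers, the 2048-element loop from this thread
    (24 KB in total) takes the plain path, and only larger trip counts
    take the nontemporal one.<br>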
<br>
-Hal<br>
<br>
<blockquote type="cite"><div><div class="h5">
<div class="gmail_extra"><br>
<div class="gmail_quote">On Sat, Jan 20, 2018 at 11:26 PM, Simon
Pilgrim <span dir="ltr"><<a href="mailto:llvm-dev@redking.me.uk" target="_blank">llvm-dev@redking.me.uk</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div>
<div class="m_3078348663970828176h5"> On 20/01/2018 18:16, hameeza ahmed
wrote:<br>
<blockquote type="cite">
                        <div dir="ltr">Actually, I am working on a vector
                          accelerator that will execute the instructions
                          that are marked nontemporal.
<div><br>
</div>
                          <div>For instance, if I have this loop:</div>
<div><br>
</div>
<div>for(i=0;i<2048;i++)</div>
<div>a[i]=b[i]+c[i];</div>
<div><br>
</div>
                          <div>Currently it emits the following IR:</div>
<div><br>
</div>
<div><br>
</div>
<div>
<div> %0 = getelementptr inbounds [2048 x i32],
[2048 x i32]* @b, i64 0, i64 %index<br>
</div>
<div> %1 = bitcast i32* %0 to <16 x i32>*</div>
<div> %wide.load = load <16 x i32>,
<16 x i32>* %1, align 16, !tbaa !1</div>
<div> %8 = getelementptr inbounds [2048 x i32],
[2048 x i32]* @c, i64 0, i64 %index</div>
<div> %9 = bitcast i32* %8 to <16 x i32>*</div>
<div> %wide.load14 = load <16 x i32>,
<16 x i32>* %9, align 16, !tbaa !1</div>
<div> %16 = add nsw <16 x i32>
%wide.load14, %wide.load</div>
<div> %20 = getelementptr inbounds [2048 x
i32], [2048 x i32]* @a, i64 0, i64 %index</div>
<div> %21 = bitcast i32* %20 to <16 x
i32>*</div>
<div> store <16 x i32> %16, <16 x
i32>* %21, align 16, !tbaa !1</div>
</div>
<div><br>
</div>
<div><br>
</div>
                          <div>However, I want it to emit the following IR: </div>
<div><br>
</div>
<div>
<div> %0 = getelementptr inbounds [2048 x i32],
[2048 x i32]* @b, i64 0, i64 %index<br>
</div>
<div> %1 = bitcast i32* %0 to <16 x i32>*</div>
<div> %wide.load = load <16 x i32>,
<16 x i32>* %1, align 16, !tbaa !1,
!nontemporal !1</div>
<div> %8 = getelementptr inbounds [2048 x i32],
[2048 x i32]* @c, i64 0, i64 %index</div>
<div> %9 = bitcast i32* %8 to <16 x i32>*</div>
<div> %wide.load14 = load <16 x i32>,
<16 x i32>* %9, align 16, !tbaa
!1, !nontemporal !1</div>
<div> %16 = add nsw <16 x i32>
%wide.load14, %wide.load, !nontemporal !1</div>
<div> %20 = getelementptr inbounds [2048 x
i32], [2048 x i32]* @a, i64 0, i64 %index</div>
<div> %21 = bitcast i32* %20 to <16 x
i32>*</div>
<div> store <16 x i32> %16, <16 x
i32>* %21, align 16, !tbaa !1, !nontemporal
!1</div>
</div>
<div><br>
</div>
                          <div>so that I can offload the load, add, and store to
                            the accelerator hardware. Is that possible here? Do I
                            need a separate pass to detect whether a loop
                            accesses nontemporal data, or will Polly help here?
                            What do you think?</div>
</div>
</blockquote>
</div>
</div>
From C/C++ you just need to use the
                __builtin_nontemporal_store/__builtin_nontemporal_load
builtins to tag the stores/loads with the nontemporal
flag.<br>
<br>
<div>for(i=0;i<2048;i++) {<br>
</div>
                <div>  __builtin_nontemporal_store(
                  __builtin_nontemporal_load(b + i) +
                  __builtin_nontemporal_load(c + i), a + i);<br>
</div>
<div>}<br>
</div>
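                <br>
                For reference, the same loop as a self-contained
                translation unit. This is only a sketch: the arrays a, b, c,
                the function name vector_sum_nt, and the NT_LOAD/NT_STORE
                macros are assumptions made for illustration, and the
                fallback branch exists only so the file compiles with
                compilers that lack the Clang builtins.<br>
                <br>
```c
#include <stddef.h>

#define N 2048

/* Assumed global arrays, mirroring @a, @b and @c in the IR above. */
static int a[N], b[N], c[N];

/* Hypothetical wrappers: Clang builtins if present, plain accesses if not. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_nontemporal_load) && \
      __has_builtin(__builtin_nontemporal_store)
#    define NT_LOAD(p)     __builtin_nontemporal_load(p)
#    define NT_STORE(v, p) __builtin_nontemporal_store((v), (p))
#  endif
#endif
#ifndef NT_LOAD
#  define NT_LOAD(p)     (*(p))
#  define NT_STORE(v, p) (*(p) = (v))
#endif

/* a[i] = b[i] + c[i], with every load and store tagged nontemporal
   when the builtins are available. */
void vector_sum_nt(void)
{
    for (size_t i = 0; i < N; i++)
        NT_STORE(NT_LOAD(b + i) + NT_LOAD(c + i), a + i);
}
```
                <br>
                Compiling this with clang -O2 -S -emit-llvm and searching
                the output for !nontemporal is one way to check that the
                loads and stores were tagged.<br>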
<br>
                There may be an attribute you can tag pointers with
                instead, but I don't know offhand.<span><br>
<br>
<blockquote type="cite">
<div class="gmail_extra">On Sat, Jan 20, 2018 at 11:02
PM, Simon Pilgrim <span dir="ltr"><<a href="mailto:llvm-dev@redking.me.uk" target="_blank">llvm-dev@redking.me.uk</a>></span>
wrote:<br>
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="m_3078348663970828176m_-9084880504328883834HOEnZb">
<div class="m_3078348663970828176m_-9084880504328883834h5">On
20/01/2018 17:44, hameeza ahmed via llvm-dev
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> Hello,<br>
<br>
                                My work deals with non-temporal loads and
                                stores. I found the non-temporal metadata in the
                                LLVM documentation, but it does not appear in the
                                IR.<br>
                                <br>
                                How can I get the non-temporal metadata?<br>
</blockquote>
</div>
</div>
                        llvm\test\CodeGen\X86\nontemporal-loads.ll
                        shows how to create nontemporal vector loads in IR. Is
                        that what you're after?<span class="m_3078348663970828176m_-9084880504328883834HOEnZb"><font color="#888888"><br>
<br>
Simon.<br>
</font></span></blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</span></div>
</blockquote>
</div>
<br>
</div>
<br>
<br>
      </div></div><pre>_______________________________________________
LLVM Developers mailing list
<a class="m_3078348663970828176moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>
<a class="m_3078348663970828176moz-txt-link-freetext" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><span class="HOEnZb"><font color="#888888">
</font></span></pre><span class="HOEnZb"><font color="#888888">
</font></span></blockquote><span class="HOEnZb"><font color="#888888">
<br>
<pre class="m_3078348663970828176moz-signature" cols="72">--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre>
</font></span></div>
</blockquote></div><br></div>