<div dir="ltr">Thanks, David, I understand. Then, is there a way to disable generation of the llvm.* intrinsics? opt seems to have an option called -disable-simplify-libcalls. However, in my case, it does not remove the llvm.memcpy instruction from the bitcode.<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Feb 11, 2016 at 6:04 PM, David Blaikie <span dir="ltr"><<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">There probably is a rule, but I don't know what it is - I would imagine memcpy is used when storing a whole aggregate (but then you'll get into ABI issues, etc - maybe if the struct contains only a single primitive type it just switches to a store, etc).</div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Feb 11, 2016 at 8:44 AM, Simona Simona <span dir="ltr"><<a href="mailto:other.dev.simona@gmail.com" target="_blank">other.dev.simona@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div>Thanks, David, this is useful. <br><br>So sometimes the front-end generates llvm.memcpy instead of store instructions.<br>Is there a rule for generating llvm.memcpy instructions instead of stores? 
I would have the same question for other intrinsics, such as memset and memmove.<br></div></div></div><div><div><div><br></div><div>Thanks, <br>Simona<br></div></div></div></div><div><div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Feb 11, 2016 at 5:24 PM, David Blaikie <span dir="ltr"><<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><div><div>On Thu, Feb 11, 2016 at 7:25 AM, Simona Simona via cfe-users <span dir="ltr"><<a href="mailto:cfe-users@lists.llvm.org" target="_blank">cfe-users@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div>Hi,<br><br>I'm using clang 3.4 to generate the bitcode of a C source file. <br>The source file is the following:<br></div><br><div style="margin-left:40px">typedef struct __attribute__ ((__packed__)) { float x, y; } myType;<br>myType make_float2(float x, float y) { myType f = { x, y }; return f; }<br><br>int main(int argc, char* argv[])<br>{<br> myType myVar[5];<br><br> for(int i=0;i<5;i++)<br> myVar[i] = make_float2(i,i);<br><br> return(myVar[1].x);<br>}<br></div><br></div>The bitcode is generated using the following command:<br></div><div style="margin-left:40px">clang -c -emit-llvm -O0 -fno-vectorize -fno-slp-vectorize -fno-lax-vector-conversions main.c -o main.bc<br></div><div><br><div><div><div><div style="margin-left:40px">target triple = "x86_64-unknown-linux-gnu"<br><br>%struct.myType = type <{ float, float }><br><br>; Function Attrs: nounwind uwtable<br>define <2 x float> @_Z11make_float2ff(float %x, float %y) #0 {<br>entry:<br> %retval = alloca %struct.myType, align 1<br> %x1 = getelementptr inbounds %struct.myType* %retval, i32 0, i32 0<br> store float %x, float* %x1, 
align 1<br> %y2 = getelementptr inbounds %struct.myType* %retval, i32 0, i32 1<br> store float %y, float* %y2, align 1<br> %0 = bitcast %struct.myType* %retval to <2 x float>*<br> %1 = load <2 x float>* %0, align 1<br> ret <2 x float> %1<br>}<br><br>; Function Attrs: nounwind uwtable<br>define i32 @main(i32 %argc, i8** %argv) #0 {<br>entry:<br> %myVar = alloca [100 x %struct.myType], align 16<br></div></div></div></div></div></div></blockquote><div><br></div></div></div><div>Looks like your IR corresponds to an array of length 100, not 5 as in your source, but that's not too important</div><span><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div><div><div style="margin-left:40px"> <b> %ref.tmp = alloca %struct.myType, align 1</b><br> br label %for.cond<br><br>for.cond: ; preds = %for.inc, %entry<br> %i.0 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]<br> %cmp = icmp slt i32 %i.0, 5<br> br i1 %cmp, label %for.body, label %for.end<br><br>for.body: ; preds = %for.cond<br> %idxprom = sext i32 %i.0 to i64<br> %arrayidx = getelementptr inbounds [100 x %struct.myType]* %myVar, i32 0, i64 %idxprom<br> %conv = sitofp i32 %i.0 to float<br> %conv1 = sitofp i32 %i.0 to float<br> <b> %call = call <2 x float> @_Z11make_float2ff(float %conv, float %conv1)</b><br><b> %0 = bitcast %struct.myType* %ref.tmp to <2 x float>*</b><br><b> store <2 x float> %call, <2 x float>* %0, align 1</b><br> %1 = bitcast %struct.myType* %arrayidx to i8*<br> %2 = bitcast %struct.myType* %ref.tmp to i8*<br> call void @llvm.memcpy.p0i8.p0i8.i64(i8* %1, i8* %2, i64 8, i32 1, i1 false)<br></div></div></div></div></div></div></blockquote><div><br></div></span><div>Here is the store ^ into your array (%1 is the destination, a bitcast of %arrayidx, which is the pointer into your array at index %idxprom, which is %i.0, etc) using the memcpy intrinsic, rather than a store instruction.</div><div> </div><blockquote 
class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span><div dir="ltr"><div><div><div><div><div style="margin-left:40px"> br label %for.inc<br><br>for.inc: ; preds = %for.body<br> %inc = add nsw i32 %i.0, 1<br> br label %for.cond<br><br>for.end: ; preds = %for.cond<br> %arrayidx2 = getelementptr inbounds [100 x %struct.myType]* %myVar, i32 0, i64 1<br> %x = getelementptr inbounds %struct.myType* %arrayidx2, i32 0, i32 0<br> %3 = load float* %x, align 1<br> %conv3 = fptosi float %3 to i32<br> ret i32 %conv3<br>}<br></div><br></div><div>Looking at the C source code there should be 5 store instructions corresponding to the 5 assignments of myVar[0], myVar[1], myVar[2], myVar[3] and myVar[4].<br>When I look at the bitcode however, I see 5 instances of <b>store <2 x float> %call, <2 x float>* %0, align 1 </b>which correspond to 5 stores at the same address <br>of %0 (which is actually %ref.tmp defined as <b>%ref.tmp = alloca %struct.myType, align 1</b>). <br><br>I would appreciate it if anyone could let me know how the 5 memory accesses at the 5 <b>different</b> memory addresses are implemented in the bitcode.<br><br></div><div>Thanks,<br></div><div>Simona<br></div><div><br></div></div></div></div></div>
<br></span>
<br></blockquote></div><br></div></div>
</blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>