<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On 2 Jan 2017, at 22:52, Mehdi Amini <<a href="mailto:mehdi.amini@apple.com" class="">mehdi.amini@apple.com</a>> wrote:</div><div class=""><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br class=""></div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">Since the LangRef for addrspacecast says " if the address space conversion is legal then both result and operand refer to the same memory location”, the optimizer is free to assume that a load to @G through either addrspace 0 or 1 yields the same value, and thus can figure that the alloca value is the same as @G. </div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">Since bar() is read-only, exchanging a pointer for another one in the same address space, when the pointers point to the same value, seems valid to me.</div></div></blockquote><div><br class=""></div><div>OK, that makes sense, thanks for the explanation.</div><div><br class=""></div><div>With that in mind, I *think* these two new tests are all that is needed to expose the bug that I’m trying to fix (they both fail with 3.9). I’ve got them passing with a patch to 4.0 so I’ll try and post something for review soon.</div><div><br class=""></div><div> define void @test10(i64 %index, i32* %out) {<br class=""> %A = alloca %U, align 16<br class=""> %a = bitcast %U* %A to i8*<br class=""> call void @llvm.memcpy.p0i8.p1i8.i64(i8* %a, i8 addrspace(1)* bitcast ([2 x %U] addrspace(1)* @I to i8 addrspace(1)*), i64 20, i32 16, i1 false)<br class=""> %tmp1 = bitcast i8* %a to i32*<br class=""> %tmp2 = getelementptr i32, i32* %tmp1, i64 %index<br class=""> %tmp3 = load i32, i32* %tmp2<br class=""> store i32 %tmp3, i32* %out<br class=""> ; CHECK-LABEL: @test10(<br class=""> ; CHECK-NOT: alloca<br class=""> ; CHECK-NOT: memcpy<br class=""> ; CHECK-NOT: addrspacecast<br class=""> ; Ensure that load still happens through addrspace(1)<br class=""> ; CHECK: load i32, i32 addrspace(1)*<br class=""> ret void<br class=""> }<br class=""><br class=""> define void @test12() {<br class=""> %A = alloca %U, align 16<br class=""> %a = bitcast %U* %A to i8*<br class=""> call void @llvm.memcpy.p0i8.p1i8.i64(i8* %a, i8 addrspace(1)* bitcast ([2 x %U] addrspace(1)* @I to i8 addrspace(1)*), i64 20, i32 16, i1 false)<br class=""> call void @bar(i8* %a) readonly<br class=""> ; Must retain memcpy as source address space doesn't match function argument.<br class=""> ; CHECK-LABEL: @test12(<br class=""> ; CHECK: llvm.memcpy<br class=""> ; CHECK: bar<br class=""> ret void<br class=""> }</div><br class=""><div>James</div><br class=""><blockquote type="cite" class=""><div class=""><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">— </div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">Mehdi</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br class=""></div><br class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><blockquote type="cite" class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;"><div class=""><div class="" style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><div class=""><br class=""></div><div class="">I can (and have) fixed this as part of the patch I’m working on, but since this implies that a couple of existing regression tests would be incorrect I just wanted to double-check that I’m not misinterpreting something.</div><div class=""><br class=""></div><div class="">CCing Matt Arsenault who added those specific tests originally in r207054.</div><div class=""><br class=""></div><div class="">James</div><div class=""><br class=""></div><div class=""><blockquote type="cite" class=""><div class="">On 9 Nov 2016, at 21:11, Mehdi Amini <<a href="mailto:mehdi.amini@apple.com" class="">mehdi.amini@apple.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><span class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;">Hi,</span><br class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><blockquote type="cite" class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">On Nov 9, 2016, at 10:13 AM, James Price via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>> wrote:<br class=""><br class="">Hi,<br class=""><br class="">I’ve recently encountered an issue where the `instcombine` pass replaces an `llvm.memcpy` between two distinct address spaces with an `addrspacecast` instruction.<br class=""><br class="">As an example, see the trivial OpenCL kernel attached. I’m compiling like this:<br class=""><br class=""> clang -cc1 -triple spir64-unknown-unknown -x cl -O0 -emit-llvm array_init.cl -o before.ll<br class=""><br class="">This yields an `llvm.memcpy` to copy the array initialiser data from the global variable (in `addrspace(2)`) to the `alloca` result (in `addrspace(0)`).<br class=""><br class="">I then apply the `instcombine` pass via:<br class=""><br class=""> opt -S -instcombine before.ll -o after.ll<br class=""><br class="">This results in the memcpy being nuked, and the `addrspace(2)` data is now accessed directly via an `addrspacecast` to `addrspace(0)`.<br class=""><br class=""><br class="">It seems to me that this sort of optimisation is only valid if it is guaranteed that the two address spaces alias for the given target triple (which for SPIR, they do not).<br class=""></blockquote><br class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><span class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;">I’m not sure “alias” is the right consideration here.</span><br class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><blockquote type="cite" class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br class=""><br class="">This particular optimisation is coming from lines ~290-300 of InstCombineLoadStoreAlloca.cpp, although I suspect this isn’t the only case where this might happen.<br class=""><br class="">Adding a check to only perform this replacement if the two address spaces are equal fixes the issue for me, but this is probably too conservative since many targets with flat address spaces will probably benefit from this optimisation.<br class="">It feels like passes should query the target about whether two address spaces alias before introducing an `addrspacecast`, but I’m not familiar enough with LLVM internals to know if this is information that is easy to make available (if it isn’t already).<br class=""><br class="">Is there something we can do here to avoid this sort of optimisation causing problems for targets with segmented address spaces?<br class=""></blockquote><br class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><span class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;">This is a bug, we can’t assume any memory layout I believe.</span><br class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><span class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;">The memcpy is supposed to be equivalent to a sequence of load and store. Here we are just failing to keep the property that the load is performed through addrspace(2). The fix is not “trivial” though, the transformation can only be performed if the address of the alloca does not escape, and some rewriting of the uses is needed to propagate the address space through GEPs for instance.</span><br class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><span class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;">—<span class="Apple-converted-space"> </span></span><br class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><span class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;">Mehdi</span></div></blockquote></div></div></div></blockquote></div></blockquote></div><br class=""></body></html>