<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<br>
<div class="moz-cite-prefix">On 03/25/2014 02:31 PM, Jingyue Wu
wrote:<br>
</div>
<blockquote
cite="mid:CAMROOrF-M6i37_M4o0Anx-+4gf+d6pynEnRqHqiEyFPG_hMxcQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div dir="ltr">
<div>This is a follow-up discussion on <a
moz-do-not-send="true"
href="http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20140324/101899.html"
target="_blank">http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20140324/101899.html</a>.
The front-end change was already pushed in r204677, so we
want to continue with the IR optimization. <br>
</div>
<div><br>
</div>
<div>In general, we want to write an IR pass to convert
generic address space usage to non-generic address space
usage, because accessing the generic address space in CUDA
and OpenCL is significantly slower than accessing
non-generic ones (such as shared and constant). </div>
<div><br>
</div>
<div>Here is an example Justin gave: </div>
<div><br>
</div>
<div><span
style="font-family:arial,sans-serif;font-size:13px">
%ptr = ...</span><br
style="font-family:arial,sans-serif;font-size:13px">
<span style="font-family:arial,sans-serif;font-size:13px">
%val = load i32* %ptr</span><br
style="font-family:arial,sans-serif;font-size:13px">
<br style="font-family:arial,sans-serif;font-size:13px">
<span style="font-family:arial,sans-serif;font-size:13px">In
this case, %ptr is a generic address space pointer
(assuming an address space mapping where 0 is generic).
But if an analysis can prove that the pointer %ptr was
originally addrspacecast'd from a specific address space
(or some other mechanism through which the pointer's
specific address space can be determined), it may be
beneficial to explicitly convert the IR to something
like:</span><br
style="font-family:arial,sans-serif;font-size:13px">
<br style="font-family:arial,sans-serif;font-size:13px">
<span style="font-family:arial,sans-serif;font-size:13px">
%ptr = ...</span><br
style="font-family:arial,sans-serif;font-size:13px">
<span style="font-family:arial,sans-serif;font-size:13px">
%ptr.0 = addrspacecast i32* %ptr to i32 addrspace(3)*</span><br
style="font-family:arial,sans-serif;font-size:13px">
<span style="font-family:arial,sans-serif;font-size:13px">
%val = load i32 addrspace(3)* %ptr.0</span><br
style="font-family:arial,sans-serif;font-size:13px">
<br style="font-family:arial,sans-serif;font-size:13px">
<span style="font-family:arial,sans-serif;font-size:13px">Such
a translation may generate better code for some targets.</span></div>
</div>
</div>
</div>
</blockquote>
Just a note of caution: for some of us, address spaces are
semantically important (i.e., introducing a cast from one to
another would be incorrect). I have no problem with the mechanism
you're describing being implemented, but it needs to be an opt-in
feature.<br>
<br>
<blockquote
cite="mid:CAMROOrF-M6i37_M4o0Anx-+4gf+d6pynEnRqHqiEyFPG_hMxcQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div dir="ltr">
<div><br>
</div>
<div>There are two major design decisions we need to make: </div>
<div><br>
</div>
<div>1. Where does this pass live? Target-independent or
target-dependent?</div>
<div><br>
</div>
<div>Both the NVPTX and R600 backends want this optimization,
which seems a good justification for making this
optimization target-independent. </div>
<div><br>
</div>
<div>However, we have three concerns about this:</div>
<div>a) I doubt this optimization is valid for all targets,
because the LLVM language reference (<a moz-do-not-send="true"
href="http://llvm.org/docs/LangRef.html#addrspacecast-to-instruction"
target="_blank">http://llvm.org/docs/LangRef.html#addrspacecast-to-instruction</a>)
says addrspacecast "can be a no-op cast or a complex value
modification, depending on the target and the address
space pair." </div>
<div>b) NVPTX and R600 have different address numbering for
the generic address space, which makes things more
complicated. </div>
<div>c) We don't have a good understanding of the R600
backend. </div>
<div><br>
</div>
<div>
Therefore, I would vote for making this optimization
NVPTX-specific for now. If other targets need this, we can
later think about how to reuse the code. <br>
</div>
</div>
</div>
</div>
</blockquote>
No opinion, but if it is target-independent, it needs to be behind
an opt-in target hook. <br>
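As a rough illustration of the opt-in gating, here is a minimal sketch in
Python over a made-up model, not the LLVM API. The class
<code>TargetInfo</code>, the hook name
<code>allows_addrspace_inference</code>, and the driver
<code>run_addrspace_pass</code> are all hypothetical names chosen for this
example. The only point is that the default answer is "no", so a target
has to ask for the rewrite explicitly.<br>
<pre>
```python
# Toy model only: none of these names exist in LLVM.

class TargetInfo:
    """Hypothetical per-target policy object."""

    def allows_addrspace_inference(self):
        # Off by default: address spaces may be semantically
        # significant for a target, so rewriting them must be opt-in.
        return False


class NVPTXLikeTarget(TargetInfo):
    """Hypothetical target that opts in to the rewrite."""

    def allows_addrspace_inference(self):
        return True


def run_addrspace_pass(target, insts, transform):
    """Apply `transform` only when the target has opted in."""
    if not target.allows_addrspace_inference():
        # Leave the instruction list unchanged (returned as a copy).
        return list(insts)
    return transform(insts)
```
</pre>
With this shape, a target that never overrides the hook gets its
instructions back untouched, whatever transform is plugged in.<br>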
<blockquote
cite="mid:CAMROOrF-M6i37_M4o0Anx-+4gf+d6pynEnRqHqiEyFPG_hMxcQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div dir="ltr">
<div><br>
</div>
<div>2. How effective do we want this optimization to be? </div>
<div><br>
</div>
<div>In the short term, I want it to be able to eliminate
unnecessary non-generic-to-generic addrspacecasts the
front-end generates for the NVPTX target. For example, <br>
</div>
<div><br>
</div>
<div>%p1 = addrspacecast i32 addrspace(3)* %p0 to i32*</div>
<div>%v = load i32* %p1</div>
<div><br>
</div>
<div>=></div>
<div><br>
</div>
<div>%v = load i32 addrspace(3)* %p0</div>
<div><br>
</div>
<div>We want similar optimizations for store+addrspacecast
and gep+addrspacecast as well. </div>
<div><br>
</div>
<div>In the long term, we could certainly improve this
optimization to handle more instructions and more
patterns. <br>
</div>
</div>
</div>
</div>
</blockquote>
Just to note, this last bit raises far fewer worries for me about
the correctness of my work. If you're loading from a pointer which was
cast from a different address space, it seems very logical to combine
that cast with the load. We'd also never generate code like that. :)<br>
<br>
To restate my concern in general terms, it's the introduction of
*new* casts which worries me, not the exploitation/optimization of
existing ones. <br>
<br>
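A rough sketch of that distinction, using a toy tuple-based IR in Python
rather than the real LLVM API. The instruction encoding, the function name
<code>fold_existing_casts</code>, and treating address space 0 as generic
are all assumptions made for illustration. The fold only looks through
casts that are already present and never creates one.<br>
<pre>
```python
# Toy model only: instructions are plain tuples, not LLVM objects.
#   ("addrspacecast", dest, src, from_space, to_space)
#   ("load", dest, ptr, space)

GENERIC = 0  # assumption: address space 0 is the generic one


def fold_existing_casts(insts):
    """Retarget loads that read through an existing cast to generic.

    No new cast is ever introduced; a load whose pointer does not
    come from a known non-generic-to-generic cast is left as it is.
    """
    casts = {}  # cast result name -> (source pointer, source space)
    out = []
    for inst in insts:
        if inst[0] == "addrspacecast":
            _, dest, src, from_space, to_space = inst
            if to_space == GENERIC and from_space != GENERIC:
                casts[dest] = (src, from_space)
            out.append(inst)
        elif inst[0] == "load" and inst[2] in casts:
            src, from_space = casts[inst[2]]
            # Load straight from the original, specific address space.
            out.append(("load", inst[1], src, from_space))
        else:
            out.append(inst)
    # Drop casts whose result no remaining instruction reads.
    used = {inst[2] for inst in out}
    return [i for i in out
            if not (i[0] == "addrspacecast" and i[1] not in used)]
```
</pre>
Fed the two-instruction example from the quoted text (a cast from
address space 3 to generic followed by a load), this returns a single
load from address space 3. A lone load from a generic pointer comes back
unchanged, which is the behavior I care about.<br>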
<font color="#888888">Philip</font><br>
</body>
</html>