<div dir="ltr">Hi Roman,<div class="gmail_extra"><br><div class="gmail_quote">On Tue, May 31, 2016 at 9:57 AM, Roman Gareev <span dir="ltr">&lt;<a href="mailto:gareevroman@gmail.com" target="_blank">gareevroman@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Hi Tobias,<br>
<br>
I think that we could split a patch that contains an implementation of<br>
tiling, interchanging and unrolling of specific loops into three<br>
separate patches:<br>
<br>
1. The first one adds a class that describes a processor model. It<br>
also adds a new command line parameter that contains all necessary<br>
parameters of a target architecture, which are used to construct<br>
objects of the class.<br></blockquote><div>Instead of creating a new class, maybe we could extend some of the classes in LLVM's TargetTransformInfo.h to achieve your goal? Or is this done in step 3?</div><div><br></div><div>Thanks</div><div>Hongbin</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
<br>
2. The second one adds methods to the class to compute parameters for<br>
instantiations of the matrix-matrix multiplication. It also implements<br>
tiling, interchanging and unrolling of specific loops.<br>
<br>
3. The third one replaces manual passing of parameters of a target<br>
architecture with utilization of information from LLVM.<br>
<br>
What do you think about it?<br>
<br>
P.S.: I’m not sure whether all necessary parameters of a target<br>
architecture are accessible from LLVM, or how best to obtain them<br>
in our case. Should we ask these questions on the mailing list now?<br>
<br>
If I’m not mistaken, we’re interested in the following parameters:<br>
<br>
1. Size of double-precision floating-point number.<br>
<br>
2. Number of double-precision floating-point numbers that can be held<br>
by a vector register.<br>
<br>
3. Throughput of vector instructions per clock cycle.<br>
<br>
4. Latency of instructions (i.e., the minimum number of cycles between<br>
the issuance of two dependent consecutive instructions).<br>
<br>
5. Parameters of cache levels (size of cache lines, associativity<br>
degrees, sizes).<br>
<span class=""><font color="#888888"><br>
--<br>
Cheers, Roman Gareev.<br>
<br>
</font></span></blockquote></div><br></div></div>