<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><br><div><div>On Mar 4, 2014, at 12:25 PM, Pierre-André Saulais <<a href="mailto:pierre-andre@codeplay.com">pierre-andre@codeplay.com</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite">
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
<div bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 04/03/14 18:08, Andrew Trick wrote:<br>
</div>
<blockquote cite="mid:16B451FC-7C1F-4C60-A690-88A327DE6667@apple.com" type="cite">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<br>
<div>
<div>On Mar 4, 2014, at 10:05 AM, Pete Cooper <<a moz-do-not-send="true" href="mailto:peter_cooper@apple.com">peter_cooper@apple.com</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<blockquote type="cite">
<div style="font-family: Helvetica; font-size: 12px;
font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;">
<div><br class="Apple-interchange-newline">
On Mar 3, 2014, at 2:21 PM, Andrew Trick <<a moz-do-not-send="true" href="mailto:atrick@apple.com">atrick@apple.com</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<blockquote type="cite">
<div style="font-family: Helvetica; font-size: 12px;
font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;">
<div><br class="Apple-interchange-newline">
On Mar 3, 2014, at 8:53 AM, Pierre-Andre Saulais <<a moz-do-not-send="true" href="mailto:pierre-andre@codeplay.com">pierre-andre@codeplay.com</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<blockquote type="cite">
<div style="font-size: 12px; font-style: normal;
font-variant: normal; font-weight: normal;
letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows:
auto; word-spacing: 0px; -webkit-text-stroke-width:
0px;">Hi Andrew,<br>
<br>
We are currently using a custom model where
scheduling information is attached to each
MCInstrDesc through tablegen, and we're trying to
move to one of LLVM's models.<br>
<br>
To expand on what JinGu mentioned, our target has
explicit ports that are used to read and write
values from and to the register file. The read port
is usually accessed on cycle 0 while the write port
is accessed when the result is written back to the
destination register. Let's assume ADD has a latency
of 1, MUL has a latency of 2 and both use port P0 to
write back their result. The two instructions below
would conflict on P0:<br>
<br>
MUL r3, r4, r5<br>
ADD r0, r1, r2<br>
NOP ; Both r0 and r3 are written back
using P0 - conflict.<br>
<br>
On our target there is no interlock which means any
conflict results in the wrong value being written
back to one of the registers. That's why we want to
model these ports as resources in the new model.
That's also why we map these port resources to each
operand as each operand accesses a different port.<br>
<br>
After reading your replies, we have realized that
the scheduler does not need to know which operand
corresponds to each port. It simply needs to know
the set of ports used by each instruction and after
how many cycles these ports are used/reserved to
avoid any conflict. That's why I believe the new
process resource model closely fits what we need,
except for the per-resource delay you mentioned.<br>
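<br>
A minimal sketch of the check described above, assuming each instruction is summarized as a list of (port, delay) pairs, where the delay is the number of cycles after issue at which the port is accessed. The names PortUse and PortReservations are hypothetical and are not part of LLVM:<br>
<pre>
```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Hypothetical description of one port access: which port is used and how
// many cycles after issue the access happens.
struct PortUse {
  unsigned Port;
  unsigned Delay;
};

// Records which (absolute cycle, port) pairs are already taken.
class PortReservations {
  std::set<std::pair<unsigned, unsigned>> Reserved;

public:
  // True if issuing an instruction with these port uses at IssueCycle would
  // hit a port that an earlier instruction already reserved for that cycle.
  bool conflicts(unsigned IssueCycle, const std::vector<PortUse> &Uses) const {
    for (const PortUse &U : Uses)
      if (Reserved.count(std::make_pair(IssueCycle + U.Delay, U.Port)))
        return true;
    return false;
  }

  // Reserves every port the instruction touches, at issue cycle plus delay.
  void reserve(unsigned IssueCycle, const std::vector<PortUse> &Uses) {
    assert(!conflicts(IssueCycle, Uses) && "port already reserved");
    for (const PortUse &U : Uses)
      Reserved.emplace(IssueCycle + U.Delay, U.Port);
  }
};
```
</pre>
With MUL described as {P0 at delay 2, P1 at delay 0, P2 at delay 0} and ADD as {P0 at delay 1, P1 at delay 0, P2 at delay 0}, reserving MUL at cycle 0 makes ADD conflict at cycle 1 (both want P0 on cycle 2) but not at cycle 2.<br>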
<br>
This is what our model currently looks like:<br>
<br>
def :ItinRW<[1_LATENCY_WITH_P0,
0_LATENCY_WITH_P1, 0_LATENCY_WITH_P2], [II_ADD]>;<br>
def :ItinRW<[2_LATENCY_WITH_P0,
0_LATENCY_WITH_P1, 0_LATENCY_WITH_P2], [II_MUL]>;<br>
<br>
where n_LATENCY_WITH_p is defined roughly as:<br>
<br>
class n_LATENCY_WITH_p<int latency,
ProcResourceKind port> :
SchedWriteRes<[PR_Pp]> {<br>
let Latency = latency;<br>
let ResourceDelays = [latency];<br>
}<br>
<br>
class PR_Pp<int portIdx> :
ProcResource<1>;<br>
<br>
The latency for register write-back/port access is
static and without interlock, which I think means
the port resources should have 'Buffered = 0' in the
definition. Is that correct?<br>
</div>
</blockquote>
<div><br>
</div>
Yes, but it isn’t sufficient. The scheduler makes no
attempt to insert nops currently. However, at the very
least, you will want to implement your own
MachineSchedStrategy. It would be natural to handle nop
insertion within your implementation.</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
Thanks, I'll have a look at MachineSchedStrategy and see how we can
implement it for our target.<br>
<blockquote cite="mid:16B451FC-7C1F-4C60-A690-88A327DE6667@apple.com" type="cite">
<div>
<blockquote type="cite">
<div style="font-family: Helvetica; font-size: 12px;
font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;">Nop
insertion during scheduling sounds good to me, but nop
insertion after regalloc has the advantage of being able to
insert nops for spill/reload. Unless you don’t have spills?</div>
</blockquote>
<div><br>
</div>
<div>To elaborate a bit more, MachineScheduler can run both
preRA and postRA. So, if you want to do nop insertion within
MachineScheduler (as opposed to a separate pass) you could
enable it only during postRA scheduling.</div>
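<div>A rough sketch of what nop insertion over an already ordered instruction sequence could look like, assuming in-order single issue, no interlocks, and a per-instruction list of (port, delay) pairs. The types and the computeNops function are hypothetical; this is not MachineScheduler code:</div>
<pre>
```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Hypothetical summary of one port access: the port and the cycle offset
// (relative to issue) at which it is accessed.
struct PortUse {
  unsigned Port;
  unsigned Delay;
};
using Instr = std::vector<PortUse>;

// Returns, for each instruction in program order, how many NOPs to place in
// front of it so that no two instructions access one port in the same cycle.
std::vector<unsigned> computeNops(const std::vector<Instr> &Seq) {
  std::set<std::pair<unsigned, unsigned>> Reserved; // (absolute cycle, port)
  std::vector<unsigned> Nops;
  unsigned Cycle = 0; // earliest cycle at which the next instruction can issue
  for (const Instr &I : Seq) {
    // Checks whether issuing I at cycle At collides with a reservation.
    auto Collides = [&](unsigned At) {
      for (const PortUse &U : I)
        if (Reserved.count(std::make_pair(At + U.Delay, U.Port)))
          return true;
      return false;
    };
    unsigned Stall = 0;
    while (Collides(Cycle + Stall))
      ++Stall; // each stall cycle corresponds to one inserted NOP
    for (const PortUse &U : I)
      Reserved.emplace(Cycle + Stall + U.Delay, U.Port);
    Nops.push_back(Stall);
    Cycle += Stall + 1;
  }
  assert(Nops.size() == Seq.size());
  return Nops;
}
```
</pre>
<div>Because the function only needs the final instruction order, it could run after register allocation, where spill and reload instructions are already present.</div>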
<div><br>
</div>
<div>-Andy</div>
</div>
</blockquote>
We are currently doing scheduling with a custom
scheduler/packetization pass at the very end of the machine
compilation process, before object generation. We aren't using the
MachineScheduler or any other scheduling before that pass, neither
preRA nor postRA. I think we could benefit from at least preRA
scheduling, so that the register live ranges seen by the RA better
match the final ranges (after our scheduling pass).<br></div></blockquote><div><br></div>I see, then you really can do whatever you want with the machine model. You’re just limited by the tables produced by the current tablegen backend and whatever features you add to it. You just need to understand the format of the tables in <YourTarget>GenSubtargetInfo.inc and try to fit your model into that format. Hopefully adding ResourceDelays will give you enough flexibility.</div><div><br><blockquote type="cite"><div bgcolor="#FFFFFF" text="#000000">
We do have spills, but I'm not sure if there is a benefit to
inserting nops before our final pass though. Would that improve
register allocation, if it was possible to do so preRA?<br></div></blockquote><div><br></div>Nope. The main reason to bundle pre-RA is to expose more opportunity for code motion (without physical register dependencies), and thus generate tighter bundles. In general, there’s no reason to introduce nops other than to meet your encoding constraints, so it’s just a matter of picking the most convenient place to do that.</div><div><br></div><div>-Andy</div><div><br><blockquote type="cite"><div bgcolor="#FFFFFF" text="#000000">
<blockquote cite="mid:16B451FC-7C1F-4C60-A690-88A327DE6667@apple.com" type="cite">
<div><br>
<blockquote type="cite">
<div style="font-family: Helvetica; font-size: 12px;
font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;">Pete<br>
<blockquote type="cite">
<div style="font-family: Helvetica; font-size: 12px;
font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br>
</div>
<div style="font-family: Helvetica; font-size: 12px;
font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;">In
fact, the interpretation of most machine model
properties (MicroOpBufferSize, resource BufferSize,
ResourceCycles, ResourceDelay) is handled within the
MachineSchedStrategy. In past emails I have been
explaining how the GenericScheduler interprets the
model, but it is really up to your custom strategy to
implement the model.</div>
<div style="font-family: Helvetica; font-size: 12px;
font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br>
<blockquote type="cite">
<div style="font-size: 12px; font-style: normal;
font-variant: normal; font-weight: normal;
letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows:
auto; word-spacing: 0px; -webkit-text-stroke-width:
0px;">I have attached a patch that adds the
'ResourceDelays' field in tablegen. Could you have a
look at it? A couple of possible issues are:<br>
- 'Delay' is signed, since 'Cycles' in
MCWriteLatencyEntry is also signed.<br>
</div>
</blockquote>
<div><br>
</div>
Sure.</div>
<div style="font-family: Helvetica; font-size: 12px;
font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br>
<blockquote type="cite">
<div style="font-size: 12px; font-style: normal;
font-variant: normal; font-weight: normal;
letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows:
auto; word-spacing: 0px; -webkit-text-stroke-width:
0px;">- When an instruction accesses the same
resource multiple times, the uses are aggregated in
SubtargetEmitter::GenSchedClassTables. I'm not sure
how that would work if we add a 'Delay' field to
MCWriteProcResEntry.<br>
</div>
</blockquote>
<div><br>
</div>
<div>Me neither. I suggest adding an assert to make sure
no one accidentally uses two resources with non-zero
delay. Otherwise, your patch looks fine to me. It’s
totally up to you to test it though. I really want to
take this patch, but we have no mechanism for testing
out-of-tree target features.</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
Adding an assert when someone uses two resources with non-zero
delay, or maybe two different delays, sounds good to me. I'm glad to
hear that you'd want to take patches even for out-of-tree features,
it's much appreciated.<br>
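<br>
As a sketch of the "two different delays" variant of that check, assuming aggregation sums the cycles of entries that name the same resource. ResEntry, aggregate and aggregateChecked are hypothetical names for illustration, not the actual SubtargetEmitter code:<br>
<pre>
```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-in for one resource entry of a scheduling class.
struct ResEntry {
  unsigned ResIdx;
  unsigned Cycles;
  unsigned Delay;
};

// Merges entries naming the same resource by summing their cycles. Returns
// false if two entries for one resource disagree on Delay, since a single
// merged entry could not represent both.
bool aggregate(const std::vector<ResEntry> &In, std::vector<ResEntry> &Out) {
  Out.clear();
  std::map<unsigned, ResEntry> Merged; // keyed by resource index
  for (const ResEntry &E : In) {
    auto It = Merged.find(E.ResIdx);
    if (It == Merged.end())
      Merged.emplace(E.ResIdx, E);
    else if (It->second.Delay != E.Delay)
      return false;
    else
      It->second.Cycles += E.Cycles;
  }
  for (const auto &KV : Merged)
    Out.push_back(KV.second);
  return true;
}

// Asserting wrapper, in the spirit of the check discussed above.
std::vector<ResEntry> aggregateChecked(const std::vector<ResEntry> &In) {
  std::vector<ResEntry> Out;
  bool Ok = aggregate(In, Out);
  assert(Ok && "same resource used with two different delays");
  (void)Ok;
  return Out;
}
```
</pre>
The stricter form, rejecting any repeated resource that has a non-zero delay, would only change the comparison in the middle branch.<br>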
<br>
I'm not very familiar with the itinerary model, but aren't these two
ways of expressing schedules equivalent?<br>
<br>
def :ItinRW<[1_LATENCY_WITH_P0, 0_LATENCY_WITH_P1,
0_LATENCY_WITH_P2], [II_ADD]>;<br>
<br>
InstrItinData<II_ADD, [InstrStage<1, [P1], 0>,
InstrStage<1, [P2]>, InstrStage<1, [P0]>], [1, 0, 0]><br>
<br>
If that's the case, then does the new machine model express
itineraries with more than one stage, without adding this
'ResourceDelays' field? <br>
<br>
Thanks,<br>
Pierre<br>
<br>
<blockquote cite="mid:16B451FC-7C1F-4C60-A690-88A327DE6667@apple.com" type="cite">
<div>
<blockquote type="cite">
<div style="font-family: Helvetica; font-size: 12px;
font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;">
<blockquote type="cite">
<div style="font-family: Helvetica; font-size: 12px;
font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;">
<div><br>
</div>
<div>-Andy</div>
<br>
<blockquote type="cite">
<div style="font-size: 12px; font-style: normal;
font-variant: normal; font-weight: normal;
letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows:
auto; word-spacing: 0px; -webkit-text-stroke-width:
0px;"><br>
Thanks,<br>
Pierre<br>
<br>
On 28/02/14 01:00, Andrew Trick wrote:<br>
<blockquote type="cite">On Feb 19, 2014, at 1:54 PM,
jingu <<a moz-do-not-send="true" href="mailto:jingu@codeplay.com">jingu@codeplay.com</a>>
wrote:<br>
<br>
<blockquote type="cite">Hi Andy,<br>
<br>
I am trying to schedule and packetize
instructions for a VLIW target at the post-RA<br>
stage or final codegen stage, where code
transformations are no longer allowed,<br>
because the hardware cannot resolve
resource conflicts. Here is a<br>
simple example:<br>
<br>
ADD dest_reg1, src_reg1, src_reg2 (functional
unit: ALU)<br>
STORE dest_reg2, mem (functional unit:
LOAD_STORE)<br>
<br>
These instructions can generally be packetized
together because there is<br>
no dependency among their operands and they use
different functional units. But<br>
we have one more restriction: some<br>
instructions cannot access the same register
file in the same cycle. In<br>
other words, if 'src_reg1' of the ADD instruction
uses register file 'A' and<br>
'dest_reg2' of the STORE instruction uses the same
register file in the same<br>
cycle, that causes a resource conflict and the two cannot
be executed in the same<br>
cycle. This restriction depends on the instruction
type. I tried to treat<br>
each register file as a resource unit that is
consumed by each operand.<br>
While scheduling instructions cycle by cycle, the register
file in use is recorded<br>
in the per-cycle state to check for conflicts. In our
heuristic, the cycle in which<br>
the resource is recorded depends on the operand's latency, so<br>
I have tried to find a way to get the latency and
resource for each<br>
operand. If it is not possible to support this
feature with the per-operand<br>
resource model, then, as you suggested, I will try to
write our own state<br>
machine or other scheduling constraint logic. I
am new to the<br>
scheduler. If you have any comments or
feel something is wrong,<br>
please let me know. It will be really helpful.<br>
</blockquote>
It sounds like the register file is static and
does not depend on register allocation. In this
case, what you tried makes sense but is really not
supported. The machine model tables are designed
to be efficient for the common case, and
per-operand resources don’t really make sense most
of the time.<br>
<br>
It sounds like you want to model the pipeline
stage at which a resource is used. To do that with
the per-operand machine model (misnomer), I think
we need a ResourceDelay vector in addition to
ResourceCycles, which we could easily add.<br>
<br>
However, overall, I think your target is
interesting enough that you may be better off
augmenting the standard machine model with your
own model. Your scheduler plugin could keep your
own tables or state machine to model the
constraints.<br>
<br>
If you want to be clever, you could write tablegen
code to build your model up from the
SchedRead/Write definitions that are part of the
standard model. You could add extra fields
specific to your model.<br>
<br>
Were you previously using the old instruction
itineraries, and now moving to the new model?<br>
<br>
-Andy<br>
<br>
<blockquote type="cite">Thanks for your kind
response,<br>
JinGu Kang<br>
<br>
On 2014-02-20 오전 2:27, Andrew Trick wrote:<br>
<blockquote type="cite">Hi JinGu,<br>
<br>
We currently have the ResourceCycles list to
indicate the number of cpu cycles during which
a resource is reserved. We could simply add a
ResourceDelay with similar grammar. The
MachineScheduler could be taught to keep track
of the first and last time that a resource is
reserved.<br>
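<br>
One way to picture a ResourceDelay alongside ResourceCycles, as a sketch only: the delay gives the first cycle of the reservation relative to issue, and the cycle count gives its length. The struct and function names below are hypothetical and do not correspond to existing LLVM tables:<br>
<pre>
```cpp
#include <cassert>

// Hypothetical per-resource record: Delay is the offset from issue at which
// the reservation starts, Cycles is how long the resource stays reserved.
struct ResourceUse {
  unsigned Resource;
  unsigned Delay;
  unsigned Cycles;
};

// Half-open interval [First, End) of absolute cycles during which a resource
// is held by one instruction.
struct Reservation {
  unsigned Resource;
  unsigned First;
  unsigned End;
};

// Computes the first and one-past-last cycle of a reservation for an
// instruction issued at IssueCycle.
Reservation reservationFor(unsigned IssueCycle, const ResourceUse &U) {
  assert(U.Cycles > 0 && "a use must hold the resource for at least one cycle");
  unsigned First = IssueCycle + U.Delay;
  return {U.Resource, First, First + U.Cycles};
}

// Two reservations clash when they name the same resource and their cycle
// intervals intersect.
bool overlaps(const Reservation &A, const Reservation &B) {
  return A.Resource == B.Resource && A.First < B.End && B.First < A.End;
}
```
</pre>
With a zero delay this reduces to the existing meaning of ResourceCycles, a reservation that starts at issue.<br>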
<br>
Note that the MachineScheduler will work with
the instruction itineraries if you choose to
implement them. That’s the only way to get a
full reservation table without customizing the
scheduler. You can plug in your own state
machine or other scheduling constraint logic.
You may want to do this if you have very
complicated constraints.<br>
<br>
Can you provide an example of the most
complicated instruction resources that you
need to model?<br>
<br>
-Andy<br>
<br>
On Feb 19, 2014, at 4:57 AM, JinGu Kang <<a moz-do-not-send="true" href="mailto:jingu@codeplay.com">jingu@codeplay.com</a>>
wrote:<br>
<br>
<blockquote type="cite">Hi Andy,<br>
<br>
I am sorry for misunderstanding the 'ReadAdvance'
code. To support<br>
per-operand resources, I feel we need more
tables and functions. If<br>
possible, I would like to hear your
opinion on whether this feature is<br>
useful. As I mentioned in my previous
e-mail, it would be useful to<br>
access the latency and the resource per
operand while checking for resource<br>
conflicts per cycle.<br>
<br>
Thanks,<br>
JinGu Kang<br>
<br>
On 18/02/14 23:09, jingu wrote:<br>
<blockquote type="cite">
<blockquote type="cite">Resources and
latency are not tied. An instruction is
mapped to a<br>
scheduling class. A scheduling class is
mapped to a set of resources<br>
and a per-operand list of latencies.<br>
</blockquote>
Thanks for your kind explanation.<br>
<br>
Our heuristic algorithm needs the
latency and the resource per<br>
operand to check resource conflicts per
cycle. To support<br>
this with LLVM, I expected a scheduling class to have a per-operand
list of resources, like<br>
its list of latencies.<br>
<br>
Can I ask about a possible change to
tablegen? I think that the<br>
'WriteResourceID' field of
'MCWriteLatencyEntry' is for identifying<br>
the WriteResources of each definition, as
the code comment says. As you<br>
know, tablegen sets the 'WriteResourceID'
field of<br>
'MCWriteLatencyEntry' to 'WriteID' when
the 'Write' of the definition is<br>
referenced by a 'ReadAdvance'. Would it cause a problem if we
always set this field to<br>
'WriteID'? I can see
that 'ReadAdvance' only uses<br>
the 'WriteResourceID' field of
'MCWriteLatencyEntry' in the<br>
'computeOperandLatency' function. I think
the pair of latency and<br>
write resource for each definition would be
useful for checking resource<br>
conflicts. For reference, I have attached
a simple patch.<br>
<br>
Thanks,<br>
JinGu Kang<br>
<br>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
<br>
<br>
--<span class="Apple-converted-space"> </span><br>
Pierre-Andre Saulais<br>
Compiler Developer<br>
Codeplay Software Ltd<br>
45 York Place, Edinburgh, EH1 3HP<br>
Tel: 0131 466 0503<br>
Fax: 0131 557 6600<br>
Website:<span class="Apple-converted-space"> </span><a moz-do-not-send="true" href="http://www.codeplay.com/">http://www.codeplay.com</a><br>
Twitter:<span class="Apple-converted-space"> </span><a moz-do-not-send="true" href="https://twitter.com/codeplaysoft">https://twitter.com/codeplaysoft</a><br>
<br>
<span><add_resource_delays.patch></span></div>
</blockquote>
</div>
<br style="font-family: Helvetica; font-size: 12px;
font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;">
<span style="font-family: Helvetica; font-size: 12px;
font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;
float: none; display: inline !important;">_______________________________________________</span><br style="font-family: Helvetica; font-size: 12px;
font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;">
<span style="font-family: Helvetica; font-size: 12px;
font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;
float: none; display: inline !important;">LLVM
Developers mailing list</span><br style="font-family:
Helvetica; font-size: 12px; font-style: normal;
font-variant: normal; font-weight: normal;
letter-spacing: normal; line-height: normal; orphans:
auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;">
<span style="font-family: Helvetica; font-size: 12px;
font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;
float: none; display: inline !important;"><a moz-do-not-send="true" href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a><span class="Apple-converted-space"> </span> <a moz-do-not-send="true" href="http://llvm.cs.uiuc.edu/">http://llvm.cs.uiuc.edu</a></span><br style="font-family: Helvetica; font-size: 12px;
font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;">
<span style="font-family: Helvetica; font-size: 12px;
font-style: normal; font-variant: normal; font-weight:
normal; letter-spacing: normal; line-height: normal;
orphans: auto; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: auto;
word-spacing: 0px; -webkit-text-stroke-width: 0px;
float: none; display: inline !important;"><a moz-do-not-send="true" href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a></span></blockquote>
</div>
</blockquote>
</div>
<br>
</blockquote>
<br>
</div>
</blockquote></div><br></body></html>