<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">On 6/26/20 1:58 AM, 林政宗 wrote:<br>
</div>
<blockquote type="cite" cite="mid:8aa594c.13e9.172ef6c1b4c.Coremail.jackie_linzz@126.com">
<div style="line-height:1.7;color:#000000;font-size:14px;font-family:Arial">
<div style="margin: 0;">Hi,</div>
<div style="margin: 0;"><br>
</div>
<div style="margin: 0;">I am planning to expand the pseudo
instructions in
XXXTargetLowering::EmitInstrWithCustomInserter(), and use
temporary virtual registers as operands.</div>
<div style="margin: 0;">If I use virtual registers, do I need to
mark them as "early clobber"?</div>
</div>
</blockquote>
<p><br>
</p>
<p>If I have an instruction XYZ, and it takes an input register VI,
and an output register VO, such that the instruction:</p>
<p> VO = XYZ VI</p>
<p>reads VI and computes VO, and if the value in VI is no longer
needed after this instruction (or was undef in the first place),
then the register allocator might assign the same physical
register to both VI and VO. You might end up with:</p>
<p>RA = XYZ RA.</p>
<p>If XYZ is really a pseudo instruction, this might not be
acceptable. You might need two distinct registers just because of
how the expansion works. For example, maybe this expands to:</p>
VO = OP1 VI<br>
VO = OP2 VO, VI<br>
<p>Note that, in this case, the expansion reads VI in two different
places. If VO and VI are assigned the same physical register, the
first instruction overwrites VI before the second one reads it, and
the expansion won't work correctly. In this case, you need
earlyclobber on the output operand of your pseudo-instruction.<br>
</p>
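<p>As a minimal sketch of that failure, the following Python snippet
simulates the two-instruction expansion on a toy register file. OP1
and OP2 are made-up stand-ins (the thread does not say what they
compute), and the register names R0 and R1 are hypothetical:</p>
<pre>
```python
def expand_xyz(regs, vo, vi):
    """Run the two-instruction expansion of the pseudo XYZ on a register file.

    OP1 and OP2 are illustrative assumptions, not real instructions:
      OP1 x    computes x + 1
      OP2 a, b computes a * b
    """
    regs[vo] = regs[vi] + 1         # VO = OP1 VI
    regs[vo] = regs[vo] * regs[vi]  # VO = OP2 VO, VI  (reads VI a second time)
    return regs[vo]


# Distinct physical registers: VI is still intact for the second instruction.
distinct = expand_xyz({"R0": 0, "R1": 5}, vo="R0", vi="R1")

# Same physical register (RA = XYZ RA): the first write destroys VI.
shared = expand_xyz({"R0": 5}, vo="R0", vi="R0")

print("distinct registers:", distinct)
print("shared register:  ", shared)
```
</pre>
<p>With these stand-in operations the two runs produce different
results for the same input value, which is the situation earlyclobber
is meant to rule out.</p>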
<p><br>
</p>
<blockquote type="cite" cite="mid:8aa594c.13e9.172ef6c1b4c.Coremail.jackie_linzz@126.com">
<div style="line-height:1.7;color:#000000;font-size:14px;font-family:Arial">
<div style="margin: 0;">I saw that virtual registers are sometimes
marked as "early clobber" in EmitInstrWithCustomInserter() in the
MIPS backend.</div>
<div style="margin: 0;">What is the effect of marking a virtual
register as "early clobber" before RA?</div>
</div>
</blockquote>
<p><br>
</p>
<p>I don't recall any effect.</p>
<p> -Hal<br>
</p>
<p><br>
</p>
<blockquote type="cite" cite="mid:8aa594c.13e9.172ef6c1b4c.Coremail.jackie_linzz@126.com">
<div style="line-height:1.7;color:#000000;font-size:14px;font-family:Arial">
<p style="margin: 0;"><br>
</p>
<div style="margin: 0;">Thanks,</div>
<div style="margin: 0;">Jerry</div>
<p>At 2020-06-25 20:29:30, "Hal Finkel" <a class="moz-txt-link-rfc2396E" href="mailto:hfinkel@anl.gov">&lt;hfinkel@anl.gov&gt;</a>
wrote:</p>
<blockquote id="isReplyContent" style="PADDING-LEFT: 1ex;
MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">
<div class="moz-cite-prefix">On 6/25/20 1:11 AM, 林政宗 via
llvm-dev wrote:<br>
</div>
<blockquote cite="mid:28a315d6.11eb.172ea1a6f4f.Coremail.jackie_linzz@126.com" type="cite">
<div style="line-height:1.7;color:#000000;font-size:14px;font-family:Arial">
<div style="line-height:1.7;color:#000000;font-size:14px;font-family:Arial">
<div style="margin:0;">Hi there,</div>
<div style="margin:0;">I am writing a backend, and I
have run into a problem.</div>
<div style="margin:0;">We don't have load/store
instructions for vector predicate registers (vpr for
short). </div>
<div style="margin:0;">The hardware has 64 vector
registers (vr for short) and 8 vector predicate
registers. There are no move instructions between
vr and vpr.</div>
<div style="margin:0;">vr supports many operations, and
vpr supports vpror, vprxor, vprand and vprinv
operations.</div>
<div style="margin:0;"> A vr has 512 bits, and a vpr has
128 bits. vr is used for v16i32, v32i16, v64i8. And a
scalar register has 32 bits.</div>
<div style="margin:0;">If we compare or add two v16i32,
an element in vpr has 8 bits. If we compare or add two
v64i8, then an element in vpr has 2 bits (one bit for the
compare flag and one bit for the carry flag). </div>
<div style="margin:0;">An element in vpr contains the carry
flag and the compare flag.</div>
<div style="margin:0;"> We have defined registers and a
new type (vpr) for vector predicate registers in the
backend.</div>
<div style="margin:0;">Although there is no direct
instruction to move vpr to vr or to move vr to vpr,
there is a method to work around this. And we have
load/store instructions for vr.</div>
<div style="margin:0;">move vpr to vr for v32i16 (from
vpr0 to vr1):</div>
<div style="margin:0;">1 vclr vr0 // clear vr0</div>
<div style="margin:0;">2 ldi r5, 0x00010001 //
load immediate (compare bit mask for v32i16) to scalar
register r5</div>
<div style="margin:0;">3 movr2vr.dup vr2, r5 //
duplicate content in r5 into vr2, </div>
<div style="margin:0;">4 vadd.t.s16 vr1, vr0, vr2,
vpr0 //vector add if element compare bit is set,
element type is 16 bit signed integer, now we have
moved compare bits from vpr0 to vr1</div>
<div style="margin:0;">5 ldi r5, 0x00020002 //
load immediate (carry bit mask for v32i16) to scalar
register r5</div>
<div style="margin:0;">6 movr2vr.dup vr2, r5 //
duplicate content in r5 into vr2</div>
<div style="margin:0;">7 vadd.c.s16 vr1, vr1, vr2,
vpr0 // vr1 = vr1 + vr2, vector add if element carry
bit is set, element type is 16 bit signed integer, now
we moved carry bits from vpr0 to vr1 too.</div>
<div style="margin:0;"><br>
</div>
<div style="margin:0;">move vr to vpr for v32i16 (from
vr1 to vpr0):</div>
<div style="margin:0;">8 vclr vr0 // clear vr0</div>
<div style="margin:0;">9 ldi r5, 0x00010001 //
load immediate (compare bit mask for v32i16) to r5</div>
<div style="margin:0;">10 movr2vr.dup vr2, r5 //
duplicate content of r5 into vr2</div>
<div style="margin:0;">11 vand.u16 vr2, vr1, vr2 //
vector and, element type is 16 bit unsigned integer,
vr2 = vr1 &amp; vr2, now we have moved compare bits
from vr1 to vr2</div>
<div style="margin:0;">12 vslt.s16 vpr0, vr0, vr2
// vector set when less than, element type is 16 bit
signed integer, now we have moved compare bits from
vr1 to vpr0</div>
<div style="margin:0;">13 ldi r5, 0x00020002 // load
immediate (carry bit mask for v32i16) to r5</div>
<div style="margin:0;">14 movr2vr.dup vr2, r5 //
duplicate content of r5 into vr2</div>
<div style="margin:0;">15 vand.u16 vr2, vr1, vr2 //
vector and for element type 16 bit unsigned integer,
vr2 has carry bits now</div>
<div style="margin:0;">16 ldi r5, 0x7FFF7FFF // max
number for 16 bit signed integer</div>
<div style="margin:0;">17 movr2vr.dup vr3, r5 //
duplicate r5 into vr3</div>
<div style="margin:0;">18 vadd.s16 vr1, vr2, vr3,
vpr0 // vpr0 has carry bits set now</div>
<div style="margin:0;"><br>
</div>
<div style="margin:0;">Each vector type has a different
instruction sequence, because the bit mask and element
type are different.</div>
<div style="margin:0;">I have tried to lower load/store
for vpr in XXXISelLowering.cpp. But there is no
guarantee that line 12 and line 18 would be assigned the
same register for vpr0. vpr0 in line 18 is an output
and is not an input.</div>
<div style="margin:0;">And vpr0 in line 12 and vpr0 in line 18
are parallel in the SelectionDAG graph. They are both
outputs.</div>
<div style="margin:0;">As a next step, I think I would try to define
three pseudo instructions, one for each of the three vector types, and
expand each pseudo instruction into its instruction
sequence before register allocation. But
I'm not sure it will work.</div>
<div style="margin:0;">What should I do? <br>
</div>
</div>
</div>
</blockquote>
<p><br>
</p>
<p>This somewhat depends on how you're modeling things, but a
late-expanded pseudo-instruction seems like a workable
approach. If the pseudo-instruction needs temporary
registers (and it looks like it does), then the
pseudo-instruction should take them as register operands (so
that RA will allocate them for you and you don't need to
worry about scavenging them later). You might, however, need
to mark such operands as "early clobber" to prevent RA from
assigning the same register as an input and output
(sometimes, depending on how the expanded code uses the
registers, this is necessary).</p>
<p> -Hal<br>
</p>
<p><br>
</p>
<blockquote cite="mid:28a315d6.11eb.172ea1a6f4f.Coremail.jackie_linzz@126.com" type="cite">
<div style="line-height:1.7;color:#000000;font-size:14px;font-family:Arial">
<div style="line-height:1.7;color:#000000;font-size:14px;font-family:Arial">
<div style="margin:0;"><br>
</div>
<div style="margin:0;">Thanks and best regards,</div>
<div style="margin:0;">Jerry</div>
</div>
</div>
<fieldset class="mimeAttachmentHeader"></fieldset>
<pre class="moz-quote-pre" wrap="">_______________________________________________
LLVM Developers mailing list
<a class="moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org" moz-do-not-send="true">llvm-dev@lists.llvm.org</a>
<a class="moz-txt-link-freetext" href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" moz-do-not-send="true">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>
</pre>
</blockquote>
<pre class="moz-signature" cols="72">--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre>
</blockquote>
</div>
</blockquote>
<pre class="moz-signature" cols="72">--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre>
</body>
</html>