<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class="">Hi Eli,</div><div class=""><br class=""></div><div class="">Thanks again for your reply. </div><div class=""><br class=""></div><div class="">I am unsure about implementing the getCrossCopyRegClass for my target. My target does not support or allow moves to and from the SR. The SR exists because it has implicit involvement in some instructions, but it is opaque to the assembler and to the user as a register. I mean, there are no instructions to directly move or read it, or even access it directly. So I think that implementing the getCrossCopyRegClass -to tell LLVM that it can be copied to another register class- is against the actual target features (unless I am missing the actual purpose of the function)</div><div class=""><br class=""></div><div class="">I ended implementing target Branch and Select instructions that have a glue input, and another pair that have a GR register as input. Both map to the same machine instructions through Pattern matching in TargetInstrInfo.td. This is what I mean, for the case of the Conditional Branch instruction :</div><div class=""><br class=""></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Monaco; background-color: rgb(255, 255, 255);" class=""><div style="margin: 0px; line-height: normal;" class="">let Uses = [SR] in {</div><div style="margin: 0px; line-height: normal;" class=""> def BRCC : T2ccbr< <span style="color: #d12f1b" class="">"br"</span>, <span style="color: #d12f1b" class="">"br"</span>> ;</div><div style="margin: 0px; line-height: normal;" class="">}</div><div style="margin: 0px; font-size: 12px; line-height: normal; font-family: Helvetica; min-height: 14px;" class=""><br class=""></div><div style="margin: 0px; line-height: normal;" class="">def : Pat<(CPU74brcc_g bb:$a, imm:$cc), (BRCC jmptarget:$a, imm:$cc, <span style="color: #272ad8" class="">0</span>)>;</div><div style="margin: 0px; line-height: normal;" class="">def : Pat<(CPU74brcc_i bb:$a, imm:$cc, GR16:$ins), (BRCC jmptarget:$a, imm:$cc, GR16:$ins)>;</div></div><div class=""> </div><div class="">Then, in MyTargetISelDAGToDAG I create ‘CPU74brcc_g' instructions if they are glued to a ‘compare’ instruction, or ‘CPU74brcc_i' instructions if they link to the result of an arithmetic instruction. During instruction selection these are replaced by actual, physical, BRCC instructions. This solution has proved to work fine without having to explicitly involve the SR. Of course, the SR is still implicitly involved by the correct definition of ‘Uses’ and ‘Defs’ on the corresponding instructions. </div><div class=""><br class=""></div><div class="">Joan Lluch</div><div class=""><br class=""></div><div class=""><br class=""></div><div><blockquote type="cite" class=""><div class="">On 3 Jun 2019, at 21:00, Eli Friedman <<a href="mailto:efriedma@quicinc.com" class="">efriedma@quicinc.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="WordSection1" style="page: WordSection1; font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class="">Have you implemented TargetRegisterInfo::getCrossCopyRegClass?<o:p class=""></o:p></div><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class="">-Eli<o:p class=""></o:p></div><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div><div style="border-style: none none none solid; border-left-width: 1.5pt; border-left-color: blue; padding: 0in 0in 0in 4pt;" class=""><div class=""><div style="border-style: solid none none; border-top-width: 1pt; border-top-color: rgb(225, 225, 225); padding: 3pt 0in 0in;" class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><b class="">From:</b><span class="Apple-converted-space"> </span>Joan Lluch <<a href="mailto:joan.lluch@icloud.com" class="">joan.lluch@icloud.com</a>><span class="Apple-converted-space"> </span><br class=""><b class="">Sent:</b><span class="Apple-converted-space"> </span>Sunday, June 2, 2019 2:39 AM<br class=""><b class="">To:</b><span class="Apple-converted-space"> </span>Eli Friedman <<a href="mailto:efriedma@quicinc.com" class="">efriedma@quicinc.com</a>><br class=""><b class="">Cc:</b><span class="Apple-converted-space"> </span>llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>><br class=""><b class="">Subject:</b><span class="Apple-converted-space"> </span>[EXT] Re: [llvm-dev] Optimizing Compare instruction selection<o:p class=""></o:p></div></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class="">Hi Eli,<o:p class=""></o:p></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class="">Thank you very much for your response. <o:p class=""></o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class="">In fact, I had already tried the X86 approach before, i.e explicitly using the status register. This is the approach that appeals more to me. I left it parked because it also produced some problems (but I left it commented out). So I have now re-lived the code, and it works fine in most cases, but there’s a particular case that causes LLVM to stop with an assertion. Apparently LLVM tries to use the SR (Status Register) register as a general purpose one, which is not allowed in my architecture. The problem exists even before I added the referred optimisations. The code that causes the problem is this:<o:p class=""></o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div></div><div class=""><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 8.5pt; font-family: Monaco, serif; color: rgb(186, 45, 162);" class="">int</span><span style="font-size: 8.5pt; font-family: Monaco, serif;" class=""><span class="Apple-converted-space"> </span>doSmth(<span class="Apple-converted-space"> </span><span style="color: rgb(186, 45, 162);" class="">int</span><span class="Apple-converted-space"> </span>y );<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><o:p class=""> </o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 8.5pt; font-family: Monaco, serif; color: rgb(186, 45, 162);" class="">int</span><span style="font-size: 8.5pt; font-family: Monaco, serif;" class=""><span class="Apple-converted-space"> </span>test( <span style="color: rgb(186, 45, 162);" class="">int</span><span class="Apple-converted-space"> </span>y )<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 8.5pt; font-family: Monaco, serif;" class="">{<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 8.5pt; font-family: Monaco, serif;" class=""> <span class="Apple-converted-space"> </span><span style="color: rgb(186, 45, 162);" class="">int</span><span class="Apple-converted-space"> </span>neg = y <<span class="Apple-converted-space"> </span><span style="color: rgb(39, 42, 216);" class="">0</span>;<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 8.5pt; font-family: Monaco, serif;" class=""> <span class="Apple-converted-space"> </span><span style="color: rgb(186, 45, 162);" class="">if</span><span class="Apple-converted-space"> </span>( neg )<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 8.5pt; font-family: Monaco, serif;" class=""> y = - y;<o:p class=""></o:p></span></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white; min-height: 15px;" class=""><span style="font-size: 8.5pt; font-family: Monaco, serif;" class=""> <o:p class=""></o:p></span></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 8.5pt; font-family: Monaco, serif;" class=""> <span class="Apple-converted-space"> </span><span style="color: rgb(186, 45, 162);" class="">int</span><span class="Apple-converted-space"> </span>rv = doSmth( y );<o:p class=""></o:p></span></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white; min-height: 15px;" class=""><span style="font-size: 8.5pt; font-family: Monaco, serif;" class=""> <o:p class=""></o:p></span></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 8.5pt; font-family: Monaco, serif;" class=""> <span class="Apple-converted-space"> </span><span style="color: rgb(186, 45, 162);" class="">return</span><span class="Apple-converted-space"> </span>neg ? - rv : rv;<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 8.5pt; font-family: Monaco, serif;" class="">}<o:p class=""></o:p></span></div></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class="">Apparently, LLVM attempts to physically use the result of a CMP instruction through a function call by storing it on a temporary register. This is found before the doSmth function call, <o:p class=""></o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div></div><div class=""><div class=""><div class=""><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><b class=""><span style="font-size: 8.5pt; font-family: Menlo, serif;" class=""> t30: i16 = CMPkr16 t4, TargetConstant:i16<0></span></b><span style="font-size: 8.5pt; font-family: Menlo, serif;" class=""><o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><b class=""><span style="font-size: 8.5pt; font-family: Menlo, serif;" class=""> t36: ch,glue = CopyToReg t0, Register:i16 $sr, t30</span></b><span style="font-size: 8.5pt; font-family: Menlo, serif;" class=""><o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><b class=""><span style="font-size: 8.5pt; font-family: Menlo, serif;" class=""> t32: i16 = NEGSETCC TargetConstant:i16<4>, t36:1</span></b><span style="font-size: 8.5pt; font-family: Menlo, serif;" class=""><o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><b class=""><span style="font-size: 8.5pt; font-family: Menlo, serif;" class=""><br class=""><br class=""></span></b><span style="font-size: 8.5pt; font-family: Menlo, serif;" class=""><o:p class=""></o:p></span></div></div></div></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class="">And this is generated after the call<o:p class=""></o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><b class=""><span style="font-size: 8.5pt; font-family: Menlo, serif;" class=""> t35: ch,glue = CopyToReg t0, Register:i16 $sr, t30</span></b><span style="font-size: 8.5pt; font-family: Menlo, serif;" class=""><o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><b class=""><span style="font-size: 8.5pt; font-family: Menlo, serif;" class=""> t31: i16 = SELCC t19, t18, TargetConstant:i16<4>, t35:1</span></b><span style="font-size: 8.5pt; font-family: Menlo, serif;" class=""><o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><b class=""><span style="font-size: 8.5pt; font-family: Menlo, serif;" class=""> t21: ch,glue = CopyToReg t18:1, Register:i16 $r0, t31</span></b><span style="font-size: 8.5pt; font-family: Menlo, serif;" class=""><o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><b class=""><span style="font-size: 8.5pt; font-family: Menlo, serif;" class=""> </span></b><span style="font-size: 8.5pt; font-family: Menlo, serif;" class=""><o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class="">NEGSETCC and SELCC are genuine instructions of my target architecture, they use the SR along with operands to produce a result.<o:p class=""></o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class="">As you can see ‘t30’ is used both before and after the function call, which is what I think that causes trouble. In particular, the assertion that I get is this:<o:p class=""></o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div></div><div class=""><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><b class=""><span style="font-size: 8.5pt; font-family: Menlo, serif;" class="">Assertion failed: (RegClass->isAllocatable() && "Virtual register RegClass must be allocatable."), function createVirtualRegister, file /Users/joan/LLVM+CLANG/llvm-7.0.1.src/lib/CodeGen/MachineRegisterInfo.cpp, line 170.</span></b><span style="font-size: 8.5pt; font-family: Menlo, serif;" class=""><o:p class=""></o:p></span></div></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class="">If I remove the ‘isAllocatable=0’ setting on the SR register, then LLVM will try to spill the SR by calling ‘storeRegToStackSlot’, which then I stop with my own assertion because the SR can’t be spilled.<o:p class=""></o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class="">If I then remove also the ‘let CopCost = -1’ for the SR register, then LLVM attempts to move it to a general purpose register by calling ‘copyPhysRegs’, which again I stop with an assertion because it’s not legal in my architecture.<o:p class=""></o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class="">So, I’m stuck as well with this approach.<o:p class=""></o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class="">Any ideas, on what I can try?<o:p class=""></o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class="">I think, that at this point I need to stop LLVM from attempting to reuse the result of the CMP instruction before and after the function call, and instead make it generate a new cmp instruction after the call. But how do I achieve that?<o:p class=""></o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class="">(maybe a difficulty is that I am Custom lowering ’’SELECT_CC’ and ‘BR_CC’, and Expanding ’SELECT’ and ‘BRCOND’, instead of the opposite? Does this make a difference?)<o:p class=""></o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class="">Thanks<o:p class=""></o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div><div class=""><div class=""><div class=""><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><span style="font-size: 9pt; font-family: Helvetica, sans-serif;" class="">Joan Lluch<br class=""><br class="">Tel: 620 28 45 13<o:p class=""></o:p></span></div></div></div></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div><div class=""><blockquote style="margin-top: 5pt; margin-bottom: 5pt;" class=""><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class="">On 2 Jun 2019, at 03:23, Eli Friedman <<a href="mailto:efriedma@quicinc.com" style="color: purple; text-decoration: underline;" class="">efriedma@quicinc.com</a>> wrote:<o:p class=""></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><o:p class=""> </o:p></div><div class=""><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">There are basically two possible approaches here.<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""><o:p class=""> </o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">One approach is to just wait until after isel to handle this: check for an addition instruction immediately followed by an cmp; then erase the cmp. See, for example,<span class="pl-en"> ARMBaseInstrInfo::optimizeCompareInstr</span>.<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""><o:p class=""> </o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">Another, you make a dedicated target-specific ISel node that produces two results: essentially, one is the result of the arithmetic, and the other is the status register. See, for example, X86TargetLowering::LowerBRCOND.<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""><o:p class=""> </o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">I don't think you'd want to try to do anything in Select(); I mean, you could try to pattern-match (cmp x, (add x, y)) or whatever, but you'd probably run into trouble if the add has multiple uses.<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""><o:p class=""> </o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">-Eli<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""><o:p class=""> </o:p></span></div></div><div class=""><div class="MsoNormal" align="center" style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; text-align: center; background-color: white;"><span style="font-size: 12pt;" class=""><hr size="2" width="1072" align="center" style="width: 804.1pt;" class=""></span></div><div id="divRplyFwdMsg" class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><b class="">From:</b><span class="apple-converted-space"> </span>llvm-dev <<a href="mailto:llvm-dev-bounces@lists.llvm.org" style="color: purple; text-decoration: underline;" class="">llvm-dev-bounces@lists.llvm.org</a>> on behalf of Joan Lluch via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" style="color: purple; text-decoration: underline;" class="">llvm-dev@lists.llvm.org</a>><br class=""><b class="">Sent:</b><span class="apple-converted-space"> </span>Saturday, June 1, 2019 10:09 AM<br class=""><b class="">To:</b><span class="apple-converted-space"> </span>llvm-dev<br class=""><b class="">Subject:</b><span class="apple-converted-space"> </span>[EXT] [llvm-dev] Optimizing Compare instruction selection<span style="font-size: 12pt;" class=""><o:p class=""></o:p></span></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""> <o:p class=""></o:p></span></div></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">I attempt to optimize the use of the ‘CMP’ instruction on my architecture by removing the instruction instances where the Status Register already had the correct status flags.<o:p class=""></o:p></span></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""><o:p class=""> </o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">The cmp instruction in my architecture is the typical one that compares two registers, or a register with an immediate, and sets the Status Flags accordingly. I implemented my ‘cmp’ instruction in LLVM by custom lowering the SETCC, SELECT_CC and BR_CC in the usual way, mostly following the ARM implementation. In particular the target cmp instruction is created and glued to a target conditional instruction, such as a conditional branch, or a conditional select. <o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""><o:p class=""> </o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">The generated code may look like this:<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""><o:p class=""> </o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">add R0, R1, R2<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">cmp R0, #0<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">breq Label<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""><o:p class=""> </o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">This works fine, but now I would want to go one step further:<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""><o:p class=""> </o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">As part of the SETCC and SELECT_CC lowering, I can determine if the conditions are met in my architecture to avoid generating a ‘cmp’ instruction at all. For example, if the left operand of the SETCC is an ‘addition', the right operand is constant zero, and the condition is 'ISD::CondCode::SETEQ, I can safely assume that the Status Register in my architecture already has the right condition just before the ‘cmp’, which has been set by the target ‘add’ instruction. So at this point I want to avoid generating the ‘cmp’ because it is redundant. <o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""><o:p class=""> </o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">My goal is to replace the assembly code shown above, by just this:<o:p class=""></o:p></span></div></div><div class=""><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""><o:p class=""> </o:p></span></div></div><div class=""><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">add R0, R1, R2<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">breq Label<o:p class=""></o:p></span></div></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""><o:p class=""> </o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">The ‘cmp’ was redundant in this case because the add instruction already set the Z (zero) flag<o:p class=""></o:p></span></div></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""><o:p class=""> </o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">I tried several things, but I got stuck in some way in all of them. For example, an approach that I tried is to have a ‘dummy_cmp’ instruction, that I generate instead of the regular ‘cmp’ when the comparison is redundant. The ‘dummy_cmp’ must be removed at a latter stage, ideally during the selDAGToDAG phase, but I failed to do so. <o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""><o:p class=""> </o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">- So my first question is. How do I remove the ‘dummy_cmp’ node in the MyTargetISelDAGtoDAG::Select() function?<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""><o:p class=""> </o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">Another approach that I tried, which I regard as preferable, is to directly avoid the creation of a ‘cmp’ instruction during ISelLowering. Instead of ‘cmp’, use the LHS operand of the SETCC operand if it mets the conditions described above. The problem that I found with this approach is that the ‘cmp’ is just a ‘glue’ to the target branch or select instruction, however, the LHS is an actual operand (for example an ‘add’ node), so I need to create a ‘glue’ for it, in order to be attached to the target branch instruction.<o:p class=""></o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""><o:p class=""> </o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class="">- My second question is therefore, how do I create a ‘glue’ for an existing ‘add’ node that I can attach to the target conditional instruction as a replacement of the ‘cmp’ instruction? <o:p class=""></o:p></span></div></div><div class=""><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""><o:p class=""> </o:p></span></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 12pt;" class=""> <o:p class=""></o:p></span></div><div class=""><div class=""><div class=""><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; background-color: white;" class=""><span style="font-size: 9pt; font-family: Helvetica, sans-serif;" class="">Joan Lluch</span></div></div></div></div></div></div></div></div></div></div></blockquote></div></div></div></div></div></blockquote></div><br class=""></body></html>