<html><head><meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">On Jul 1, 2013, at 11:52 AM, Eli Friedman &lt;<a href="mailto:eli.friedman@gmail.com">eli.friedman@gmail.com</a>&gt; wrote:<br><div><br class="Apple-interchange-newline"><blockquote type="cite"><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><div dir="ltr">On Mon, Jul 1, 2013 at 11:30 AM, Quentin Colombet<span class="Apple-converted-space"> </span><span dir="ltr">&lt;<a href="mailto:qcolombet@apple.com" target="_blank">qcolombet@apple.com</a>&gt;</span><span class="Apple-converted-space"> </span>wrote:<br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; position: static; z-index: auto;"><div style="word-wrap: break-word;">Hi,<div><br></div><div>** Problem **</div><div>I am looking for advice on how to share some logic between DAG combine and target lowering.</div><div><br></div><div>Basically, I need to know if a bitcast that is about to be inserted during target-specific isel lowering will be eliminated during DAG combine.</div><div><br></div><div>Let me know if there is another, better-supported approach for this kind of problem.</div><div><br></div><div>** Motivating Example **</div><div>The motivating example comes from the lowering of vector code on ARMv7.</div><div>More specifically, the build_vector node is lowered to a target-specific ARMISD::build_vector where all the parameters are bitcast to floating point 
types.</div><div><br></div><div>This works well, unless the inserted bitcasts survive until instruction selection. In that case, they incur moves between the integer unit and the floating-point unit, which may result in inefficient code.</div><div><br></div><div>The attached motivating_example.ll shows such a case:</div><div>llc -O3 -mtriple thumbv7-apple-ios3 motivating_example.ll -o -</div><div><div style="margin: 0px; font-size: 11px; font-family: Menlo;"><span style="white-space: pre-wrap;">        </span>ldr<span style="white-space: pre-wrap;">   </span>r0, [r1]</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;"><span style="white-space: pre-wrap;">   </span>ldr<span style="white-space: pre-wrap;">   </span>r1, [r2]</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;"><span style="white-space: pre-wrap;">   </span>vmov<span style="white-space: pre-wrap;">  </span>s1, r1</div><div style="margin: 0px; font-size: 11px; font-family: Menlo;"><span style="white-space: pre-wrap;">     </span>vmov<span style="white-space: pre-wrap;">  </span>s0, r0</div></div><div style="margin: 0px;">Here, each ldr/vmov sequence could have been replaced by a simple vld1.32.</div><div><br></div><div>** Proposed Solution **</div><div>Lower to more vector-friendly code (using a sequence of insert_vector_elt) when the bitcasts will not be free.</div><div>The attached patch demonstrates that, but is missing the proper check to know what DAG combine will do (see TODO).</div><div></div></div></blockquote></div></div><div class="gmail_extra"><br></div><div class="gmail_extra">I think you're approaching this backwards: the obvious thing to do is to generate the insert_vector_elt sequence unconditionally, and DAGCombine that sequence to a build_vector when appropriate.</div></div></div></blockquote><div dir="auto">Hi Eli,</div><div dir="auto"><br></div><div dir="auto">I have started to look in the direction you suggested.</div><div dir="auto"><br></div><div dir="auto">I may 
have missed something, but I do not see how the proposed direction solves the issue. Indeed, to be able to DAGCombine an insert_vector_elt sequence into an ARMISD::build_vector, I still need to know whether it would be profitable, i.e., whether DAGCombine will remove the bitcasts that the combining/lowering is about to insert.</div><div dir="auto"><br></div><div dir="auto">Since target-specific DAGCombines are also done in TargetLowering, I do not have access to more DAGCombine logic (at least, DAGCombineInfo does not provide the required information).</div><div dir="auto"><br></div><div dir="auto">What did I miss?</div><div dir="auto"><br></div><div dir="auto">Thanks,</div><div dir="auto"><br></div><div dir="auto">-Quentin</div><br><blockquote type="cite"><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><div dir="ltr"><div class="gmail_extra"><br></div><div class="gmail_extra">-Eli</div></div></div></blockquote></div><br></body></html>