<div dir="ltr"><div dir="ltr"><div dir="ltr"><br><br><div class="gmail_quote"><div dir="ltr">On Wed, Sep 26, 2018 at 9:42 AM Cameron McInally <<a href="mailto:cameron.mcinally@nyu.edu" target="_blank">cameron.mcinally@nyu.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">On Wed, Sep 26, 2018 at 9:32 AM Sanjay Patel <<a href="mailto:spatel@rotateright.com" target="_blank">spatel@rotateright.com</a>> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><br><br><div class="gmail_quote"><div dir="ltr">On Tue, Sep 25, 2018 at 7:47 PM Cameron McInally <<a href="mailto:cameron.mcinally@nyu.edu" target="_blank">cameron.mcinally@nyu.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><br><div class="gmail_quote"><div><div>This is the first time I'm looking at foldShuffledBinop(...), so maybe a naive question, but why not do similar shuffle canonicalizations on unary (or ternary) operations? That may be a better fix in the long run.</div></div></div></div></div></div></div></blockquote><div><br></div><div>AFAIK, all of the math/logic folding that we do currently is on binary operators because all of the instructions have that form:</div><div><a href="http://llvm.org/docs/LangRef.html#instruction-reference" target="_blank">http://llvm.org/docs/LangRef.html#instruction-reference</a><br></div><div><br></div><div>As discussed, we fake the unary neg/not/fneg as binops. 
Excluding control-flow, the only unary instructions are casts, and I don't see any ternary or higher math ops other than intrinsics.<br></div></div></div></div></blockquote><div><br></div><div>Digressing a bit...</div><div><br></div><div>That sounds like a bug, not a feature. Casts/converts, rounds, abs, other libm functions, fmas, compares, and probably more can be masked. If intermixed shuffles are preventing combines on those, intrinsics or not, that isn't ideal.</div></div></div></div></blockquote><div><br></div><div>Casts and compares have plenty of specialized transforms, so I think we have that covered. Similarly, we have libm and intrinsic transforms split between LibCallSimplifier and InstCombiner::visitCallInst(). We could organize things better, but there probably aren't too many holes in those optimizations (or at least I haven't seen them yet).</div><div><br></div><div>To bring it back to the question of fneg, let me know if this is an accurate summary:<br></div><div>1. fneg would be nice to have for clarity, but it doesn't make optimization in the default LLVM FP environment any easier or better.<br></div><div>2. We will have to do some preliminary work in the IR optimizer to avoid regressions if we add fneg to the IR.<br></div><div>3. We want fneg as a first-class instruction even though the related fabs/copysign bitstring ops are intrinsics (because fneg is more common than the others?).<br></div><div>4. Adding fneg to IR means we do not need to add a constrained intrinsic for fneg (likewise, there's no need for constrained fabs/copysign because those intrinsics already exist).<br></div></div></div></div></div>