<div dir="ltr"><div>I'm guessing nobody has started implementing any of the suggested rotate functionality since there are still open questions, but let me know if I'm wrong.</div><div><br></div><div>We're still getting patches that try to work around the current limitations (<a href="https://reviews.llvm.org/D48705" rel="noreferrer" target="_blank">https://reviews.llvm.org/<wbr>D48705</a>), so we should move forward now that we've approximated the costs and justified the benefits.<br></div><div class="gmail_extra"><br></div><div class="gmail_extra">Let's settle on the intrinsic definition(s).<br></div><div class="gmail_extra"><br></div><div class="gmail_extra">1. There was a suggestion to generalize rotate to a "valign" or "double shift" (that's what x86 means by its poorly worded "double precision shift"). How does that work with vector types? The options are a full vector-size shift or a per-element shift. If it's the full vector, do we still want/need a specialized intrinsic for per-element rotates? If it's per-element, do we still want/need the other form for a full vector?</div><div class="gmail_extra"><br></div><div class="gmail_extra">2. What is the behavior for a shift/rotate amount that is equal to or greater than the bit width of the operand (or, for vectors, the bit width of the element type)? Can we take that amount modulo the bit width, or does that not map well to the hardware semantics?<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, May 17, 2018 at 5:23 PM, John Regehr <span dir="ltr">&lt;<a href="mailto:regehr@cs.utah.edu" target="_blank">regehr@cs.utah.edu</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Thanks Sanjay!<br>

<br>

At this point the cost/benefit tradeoff for rotate intrinsics seems pretty good.<br>

<br>

John<span><br>

<br>

<br>

On 05/17/2018 11:14 AM, Sanjay Patel wrote:<br>

</span><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>

A rotate intrinsic should be relatively close in cost/complexity to the existing bswap.<br>

<br>

A grep for Intrinsic::bswap says we'd probably add code in:<br>

InstCombine<br>

InstructionSimplify<br>

ConstantFolding<br>

DemandedBits<br>

ValueTracking<br>

VectorUtils<br>

SelectionDAGBuilder<br>

<br>

But I don't think it's fair to view those additions as pure added cost. As an example, consider that we have to add hacks to EarlyCSE to recognize multi-IR-instruction min/max/abs patterns. Intrinsics just work as-is there. So if you search for 'matchSelectPattern', you get an idea (I see 32 hits in 10 files) of the cost of *not* having intrinsics for those operations that we've decided are not worthy of intrinsics.<br>

<br>

<br></span><span>

On Wed, May 16, 2018 at 2:20 PM, John Regehr via llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>&gt; wrote:<br>

<br>

    On 5/16/18 1:58 PM, Sanjay Patel via llvm-dev wrote:<br>

<br>

        An informal metric might be: if the operation is supported as a<br>

        primitive op or built-in in source languages and it is supported<br>

        as a single target instruction, can we guarantee that 1-to-1<br>

        translation through optimization?<br>

<br>

<br>

    It seems perfectly reasonable for LLVM users to expect this to<br>

    happen reliably.<br>

<br>

    I'd like to take a look at the other side of the equation: the cost<br>

    of adding a new intrinsic in terms of teaching passes to see through<br>

    it, so we don't lose optimizations that worked before the intrinsic<br>

    was added.<br>

<br>

    For example, clearly ValueTracking needs a few lines added so that<br>

    computeKnownBits and friends don't get stopped by a rotate. Anyone<br>

    have a reasonably complete list of files that need similar changes?<br>

<br>

    John<br>

<br>

    ______________________________<wbr>_________________<br>

    LLVM Developers mailing list<br></span>

    <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

    <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>

<br>

<br>

</blockquote>

</blockquote></div><br></div></div>