<div dir="ltr"><div>You can call DataLayout::isLegalInteger() to verify the type you're shrinking to is reasonable (InstCombine already does this, e.g. InstCombiner::shouldChangeType()).</div><div>This isn't quite the same thing as querying TLI, but...</div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Mar 7, 2017 at 9:36 AM, Friedman, Eli <span dir="ltr"><<a href="mailto:efriedma@codeaurora.org" target="_blank">efriedma@codeaurora.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

  <div bgcolor="#FFFFFF"><div><div class="gmail-h5">

    <div class="gmail-m_-3688459371949455432moz-cite-prefix">On 3/7/2017 9:15 AM, Wei Mi wrote:<br>

    </div>

    <blockquote type="cite">

      <div dir="ltr"><br>

        <div class="gmail_extra"><br>

          <div class="gmail_quote">On Mon, Mar 6, 2017 at 8:50 PM, Wei

            Mi <span dir="ltr"><<a href="mailto:wmi@google.com" target="_blank">wmi@google.com</a>></span>

            wrote:<br>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

              <div dir="ltr"><br>

                <div class="gmail_extra"><br>

                  <div class="gmail_quote"><span>On Mon, Mar 6,

                      2017 at 4:36 PM, Friedman, Eli <span dir="ltr"><<a href="mailto:efriedma@codeaurora.org" target="_blank">efriedma@codeaurora.org</a>></span>

                      wrote:<br>

                    </span><span>

                      <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="gmail-m_-3688459371949455432m_-8464126786741311821gmail-">On

                          3/6/2017 11:06 AM, Wei Mi wrote:<br>

                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

                            After looking at the benchmark motivating

                            the test, I find it is much harder for

                            codegenprepare to catch all the shrinking

                            opportunities compared with instcombine,

                            because of two things:<br>

                            <br>

                            * loadpre may merge the original load in a

                            shrinking pattern with other load in a very

                            early position, and there maybe other stores

                            in between. The safety check for the

                            codegenprepare now simply scan

                            mayWriteToMemory insns between load and

                            store in the candidate shrinking pattern,

                            and it will fail because of those stores in

                            between.<br>

                          </blockquote>

                          <br>

                        </span>

                        It would be easy to make the check a lot more

                        powerful using MemorySSA; I guess you don't have

                        access to that in the passes where you're doing

                        the transform?</blockquote>

                      <div><br>

                      </div>

                    </span>

                    <div>Yes, if MemorySSA + alias query can be used, it

                      should solve the problem. I didn't find existing

                      usage of MemorySSA in instcombine or CGP so I am

                      not sure whether it can be used. </div>

                    <span>

                      <div> </div>

                      <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="gmail-m_-3688459371949455432m_-8464126786741311821gmail-"><br>

                          <br>

                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

                            * we may see the following pattern. a.f1 is

                            a bitfield. Ideally, we should do the

                            shrinking for all the three stores. But

                            codegenprepare works in top-down order.

                            After the first store is shrinked, it is

                            hard for the second and the third to know

                            %bf.set is the same as the value loaded from

                            %0. If we do this in instcombine, it is much

                            easier because before load/store

                            elimination, %bf.set will be loaded from %0

                            respecitively in if.then9 and in if.end41.<br>

                          </blockquote>

                          <br>

                        </span>

                        I'm not following how it helps to do this in

                        instcombine?  Normally, EarlyCSE runs first.  (A

                        complete C testcase might be useful here.)</blockquote>

                      <div><br>

                      </div>

                    </span>

                    <div>It is not earlycse. It is also loadpre to cause

                      the load to be replaced by %bf.set. </div>

                    <div><br>

                    </div>

                    <div>

                      <div style="font-size:12.8px">Before global value

                        numbering:</div>

                      <span>

                        <div style="font-size:12.8px"><br class="gmail-m_-3688459371949455432m_-8464126786741311821gmail-Apple-interchange-newline">

                          entry:</div>

                        <div style="font-size:12.8px">  %0 = bitcast

                          %class.B* %this to i64*</div>

                        <div style="font-size:12.8px">  %bf.load = load

                          i64, i64* %0, align 8</div>

                        <div style="font-size:12.8px">

                          <div>  %dec = add i64 %bf.load, 65535</div>

                          <div>  %bf.value = and i64 %dec, 65535</div>

                          <div>  %bf.clear5 = and i64 %bf.load, -65536</div>

                          <div>  %bf.set = or i64 %bf.value, %bf.clear5</div>

                          <div>  store i64 %bf.set, i64* %0, align 8</div>

                        </div>

                        <div style="font-size:12.8px">  ...</div>

                        <div style="font-size:12.8px">if.then9:</div>

                      </span>

                      <div style="font-size:12.8px">  <span style="font-size:12.8px">%bf.load1 = load i64,

                          i64* %0, align 8                      //

                          %bf.load1 is helpful for the patch to

                          recognize the load-and-or-store pattern.</span></div>

                      <div style="font-size:12.8px">

                        <div>  %inc79 = add i64 %bf.load1,

                          281474976710656       </div>

                        <span>

                          <div>  %bf.shl = and i64 %inc79,

                            71776119061217280</div>

                        </span>

                        <div>  %bf.clear18 = and i64 %bf.load1,

                          -71776119061217281</div>

                        <span>

                          <div>  %bf.set19 = or i64 %bf.shl, %bf.clear18</div>

                          <div>  store i64 %bf.set19, i64* %0, align 8</div>

                        </span></div>

                      <div style="font-size:12.8px">  ...</div>

                      <div style="font-size:12.8px">if.end41:</div>

                      <div style="font-size:12.8px">  <span style="font-size:12.8px">%bf.load2 = load i64,

                          i64* %0, align 8                      </span><span style="font-size:12.8px">// %bf.load2 is

                          helpful for the patch to recognize the

                          load-and-or-store pattern.</span></div>

                      <div style="font-size:12.8px">

                        <div>  %inc4578 = add i64 %bf.load2, 65536      

                                          </div>

                        <span>

                          <div>  %bf.shl48 = and i64 %inc4578,

                            4294901760</div>

                        </span>

                        <div>  %bf.clear49 = and i64 %bf.load2,

                          -4294901761</div>

                        <span>

                          <div>  %bf.set50 = or i64 %bf.shl48,

                            %bf.clear49</div>

                          <div>  store i64 %bf.set50, i64* %0, align 8</div>

                        </span></div>

                    </div>

                    <div> </div>

                    <div>After global value numbering:</div>

                    <span>

                      <div><br>

                      </div>

                      <div>

                        <div style="font-size:12.8px">entry:</div>

                        <div style="font-size:12.8px">  %0 = bitcast

                          %class.B* %this to i64*</div>

                        <div style="font-size:12.8px">  %bf.load = load

                          i64, i64* %0, align 8</div>

                        <div style="font-size:12.8px">

                          <div>  %dec = add i64 %bf.load, 65535</div>

                          <div>  %bf.value = and i64 %dec, 65535</div>

                          <div>  %bf.clear5 = and i64 %bf.load, -65536</div>

                          <div>  %bf.set = or i64 %bf.value, %bf.clear5</div>

                          <div>  store i64 %bf.set, i64* %0, align 8</div>

                        </div>

                        <div style="font-size:12.8px">  ...</div>

                        <div style="font-size:12.8px">if.then9:</div>

                        <div style="font-size:12.8px">

                          <div>  %inc79 = add i64 %bf.set,

                            281474976710656       // we hope to know

                            %bf.set is the same as load i64, i64* %0,

                            align 8.</div>

                          <div>  %bf.shl = and i64 %inc79,

                            71776119061217280</div>

                          <div>  %bf.clear18 = and i64 %bf.set,

                            -71776119061217281</div>

                          <div>  %bf.set19 = or i64 %bf.shl, %bf.clear18</div>

                          <div>  store i64 %bf.set19, i64* %0, align 8</div>

                        </div>

                        <div style="font-size:12.8px">  ...</div>

                        <div style="font-size:12.8px">if.end41:</div>

                        <div style="font-size:12.8px">

                          <div>  %inc4578 = add i64 %bf.set, 65536      

                                             // we hope to know %bf.set

                            is the same as load i64, i64* %0, align 8.</div>

                          <div>  %bf.shl48 = and i64 %inc4578,

                            4294901760</div>

                          <div>  %bf.clear49 = and i64 %bf.set,

                            -4294901761</div>

                          <div>  %bf.set50 = or i64 %bf.shl48,

                            %bf.clear49</div>

                          <div>  store i64 %bf.set50, i64* %0, align 8</div>

                        </div>

                      </div>

                    </span></div>

                </div>

              </div>

            </blockquote>

          </div>

        </div>

      </div>

    </blockquote>

    <br></div></div>

    I don't like depending on pass ordering like this... I mean, I can't

    really say without the original C testcase, but at first glance,

    EarlyCSE should be catching this, and you're just getting lucky.<span class="gmail-"><br>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div> Or can I move all the types of shrinking to

              instcombine? Since I only do the shrinking to i8, i16,

              i32, I think almost all the general architectures support

              i8/i16/i32 load/store directly?</div>

          </div>

        </div>

      </div>

    </blockquote></span>

    It should be safe to assume every architecture has a reasonable

    lowering for i8, i16, and i32 load/store.<span class="gmail-HOEnZb"><font color="#888888"><br>

    <br>

    -Eli<br>

    <pre class="gmail-m_-3688459371949455432moz-signature" cols="72">-- 

Employee of Qualcomm Innovation Center, Inc.

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project</pre>

  </font></span></div>

</blockquote></div><br></div></div>