<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    HI, Chandler:<br>
    <br>
      The bug Galina reported is false positive -- someone remove
    --triple=xxx in the testing case, which disable the the
    optimization. <br>
    Now the testing case is moved to the arch-specific folder. It should
    be ok now. <br>
    <br>
       >segfault on certain loops.<br>
      Why not show me the loop? <br>
    <br>
     > there are pretty pervasive problems with this patch.<br>
     In you long mail, I don't see any real problems other than some
    typos and few code coding std issue. <br>
     Some "style" is the combination of different reviewer's tastes. 
    Some reviewers' taste are just opposite to you personal opinions. <br>
    <br>
     Also see the following interleaving comment. <br>
    <br>
    <blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
      type="cite">
      <div style="font-family: arial, helvetica, sans-serif; font-size:
        10pt">
        <div dir="ltr">
          <div class="gmail_extra">
            <div class="gmail_quote">
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex"> 
                public:<br>
                +  /// PopcntHwSupport - Hardware support for population
                count. Compared to the<br>
                +  /// SW implementation, HW support is supposed to
                significantly boost the<br>
                +  /// performance when the population is dense, and it
                may or not may degrade<br>
              </blockquote>
              <div><br>
              </div>
              <div style="">"it may or may not" rather than "it may or
                not may".</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    This is *REAL* problem. a typo. <br>
    <blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
      type="cite">
      <div style="font-family: arial, helvetica, sans-serif; font-size:
        10pt">
        <div dir="ltr">
          <div class="gmail_extra">
            <div class="gmail_quote">
              <div> </div>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                +  /// performance if the population is sparse. A HW
                support is considered as<br>
                +  /// "Fast" if it can outperform, or is on a par with,
                SW implementaion when<br>
                +  /// the population is sparse; otherwise, it is
                considered as "Slow".<br>
              </blockquote>
              <div><br>
              </div>
              <div style="">I wonder, why not express this in the
                interface? You could query for fast/slow with an
                expected density?</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    Why not?<br>
    for the loop like this, slow HW support may outperform the SW
    version:<br>
    <br>
     for (i = 0; I < 32; I++)<br>
       cnt += (val >> 1) & 1;<br>
    <br>
    <br>
    <br>
    <blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
      type="cite">
      <div style="font-family: arial, helvetica, sans-serif; font-size:
        10pt">
        <div dir="ltr">
          <div class="gmail_extra">
            <div class="gmail_quote">
              <div style=""><br>
              </div>
              +  enum PopcntHwSupport {<br>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                +    None,<br>
                +    Fast,<br>
                +    Slow<br>
                +  };<br>
              </blockquote>
              <div><br>
              </div>
              <div style="">Please follow the coding standards for
                naming enum types and enumerators.</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    Why it doesn't conform the std?<br>
    <br>
    <blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
      type="cite">
      <div style="font-family: arial, helvetica, sans-serif; font-size:
        10pt">
        <div dir="ltr">
          <div class="gmail_extra">
            <div class="gmail_quote">
              <div> </div>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                +<br>
                   virtual ~ScalarTargetTransformInfo() {}<br>
                <br>
                   /// isLegalAddImmediate - Return true if the
                specified immediate is legal<br>
                @@ -122,6 +134,11 @@<br>
                   virtual bool shouldBuildLookupTables() const {<br>
                     return true;<br>
                   }<br>
                +<br>
                +  /// getPopcntHwSupport - Return hardware support for
                population count.<br>
              </blockquote>
              <div><br>
              </div>
              <div style="">Please follow the doxygen examples in the
                coding standards and don't duplicate function names in
                comments.</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    Ok, I miss this one. <br>
    <br>
    <blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
      type="cite">
      <div style="font-family: arial, helvetica, sans-serif; font-size:
        10pt">
        <div dir="ltr">
          <div class="gmail_extra">
            <div class="gmail_quote">
              <div> </div>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                +  virtual PopcntHwSupport getPopcntHwSupport(unsigned
                IntTyWidthInBit) const {<br>
              </blockquote>
              <div><br>
              </div>
              <div style="">Why accept the integer type width? does it
                matter other than it being a legal integer width for the
                target?</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    The cost are different for int-width. <br>
    <br>
    <blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
      type="cite">
      <div style="font-family: arial, helvetica, sans-serif; font-size:
        10pt">
        <div dir="ltr">
          <div class="gmail_extra">
            <div class="gmail_quote">
              <div> </div>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                +    return None;<br>
                +  }<br>
                 };<br>
                <br>
                 /// VectorTargetTransformInfo - This interface is used
                by the vectorizers<br>
                <br>
                Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp<br>
                URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=168931&r1=168930&r2=168931&view=diff"
                  target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=168931&r1=168930&r2=168931&view=diff</a><br>
==============================================================================<br>
                --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
                (original)<br>
                +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Thu
                Nov 29 13:38:54 2012<br>
                @@ -17670,6 +17670,17 @@<br>
                   return -1;<br>
                 }<br>
                <br>
                +ScalarTargetTransformInfo::PopcntHwSupport<br>
                +X86ScalarTargetTransformImpl::getPopcntHwSupport(unsigned
                TyWidth) const {<br>
                +  assert(isPowerOf2_32(TyWidth) && "Ty width
                must be power of 2");</blockquote>
              <div><br>
              </div>
              <div style="">This constraint seems strange. It certainly
                doesn't seem like it should be an assert.</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    It catch the case when this function is fed with nonsense parameter,
    and it at least make sense for x86<br>
     <br>
    <blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
      type="cite">
      <div style="font-family: arial, helvetica, sans-serif; font-size:
        10pt">
        <div dir="ltr">
          <div class="gmail_extra">
            <div class="gmail_quote">
              <div style="">
                <br>
              </div>
              <div style="">If i have an i24, i cant count the
                population of it by counting the population of it
                zext'ed to 32 bits...</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    You have to pass 32 when calling this function. <br>
    <blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
      type="cite">
      <div style="font-family: arial, helvetica, sans-serif; font-size:
        10pt">
        <div dir="ltr">
          <div class="gmail_extra">
            <div class="gmail_quote">
              <div> </div>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                 </blockquote>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                +  const X86Subtarget &ST =
                TLI->getTargetMachine().getSubtarget<X86Subtarget>();<br>
                +<br>
                +  // TODO: Currently the __builtin_popcount()
                implementation using SSE3<br>
                +  //   instructions is inefficient. Once the problem is
                fixed, we should<br>
                +  //   call ST.hasSSE3() instead of ST.hasSSE4().<br>
              </blockquote>
              <div><br>
              </div>
              <div style="">__builtin_popcount isn't the issue here,
                this is in LLVM. What is the actual issue? Why isn't
                "Slow' used here?</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    Please check my original mail.  It is a real LLVM defect in generate
    optimal instruction sequence using SSE3. <br>
    <br>
    <blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
      type="cite">
      <div style="font-family: arial, helvetica, sans-serif; font-size:
        10pt">
        <div dir="ltr">
          <div class="gmail_extra">
            <div class="gmail_quote">
              <div style=""><br>
              </div>
              <div style="">Also LLVM uses FIXME in comments more than
                TODO.</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    It is in my TODO list. <br>
    <blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
      type="cite">
      <div style="font-family: arial, helvetica, sans-serif; font-size:
        10pt">
        <div dir="ltr">
          <div class="gmail_extra">
            <div class="gmail_quote">
              <div> <br>
              </div>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex"> 
                namespace {<br>
                +<br>
                +  class LoopIdiomRecognize;<br>
                +<br>
                +  /// This class defines some utility functions for
                loop idiom recognization.<br>
              </blockquote>
              <div><br>
              </div>
              <div style="">Please don't use a class to organize a
                collection of static methods. That is the role of a
                namespace.</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    Some reviewers hate this idea.  It is not hard to imagine, this
    class will become more and more complicate.<br>
    At that point, it will not be a container for static functions any
    more. <br>
    <br>
    <blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
      type="cite">
      <div style="font-family: arial, helvetica, sans-serif; font-size:
        10pt">
        <div dir="ltr">
          <div class="gmail_extra">
            <div class="gmail_quote">
              <div style=""><br>
              </div>
              <div style="">And this doesn't need a namespace at all --
                this is in a single .cc file. Please just use static
                functions, that is is the overwhelmingly predominant
                pattern in LLVM.</div>
              <div> </div>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                +  class LIRUtil {<br>
                +  public:<br>
                +    /// Return true iff the block contains nothing but
                an uncondition branch<br>
                +    /// (aka goto instruction).<br>
                +    static bool isAlmostEmpty(BasicBlock *);<br>
              </blockquote>
              <div><br>
              </div>
              <div style="">Please use names for parameters in LLVM</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
      type="cite">
      <div style="font-family: arial, helvetica, sans-serif; font-size:
        10pt">
        <div dir="ltr">
          <div class="gmail_extra">
            <div class="gmail_quote">
              <div style=""><br>
              </div>
              <div style="">Also, I don't think this needs a helper
                method. First, you *always* have a terminator
                instruction, the verify assures this. Typically, you
                would then test that &BB->begin() ==
                BB->getTerminatorInst().</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    <blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
      type="cite">
      <div style="font-family: arial, helvetica, sans-serif; font-size:
        10pt">
        <div dir="ltr">
          <div class="gmail_extra">
            <div class="gmail_quote">
              <div style=""><br>
              </div>
              <div style="">Or is the point of this function to test for
                an empty BB *and* an unconditional branch? If so, why do
                you need to handle that -- SimplifyCFG should have
                removed such blocks already?</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    I just to make sure this code work for any input. <br>
    I don't want to assume the input is CFG optimized. <br>
    <br>
    <blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
      type="cite">
      <div style="font-family: arial, helvetica, sans-serif; font-size:
        10pt">
        <div dir="ltr">
          <div class="gmail_extra">
            <div class="gmail_quote">
              <div>
                 </div>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                +<br>
                +    static BranchInst *getBranch(BasicBlock *BB) {<br>
                +      return
                dyn_cast<BranchInst>(BB->getTerminator());<br>
                +    }<br>
              </blockquote>
              <div><br>
              </div>
              <div style="">Abstracting this into a method doesn't seem
                to help readability at all. I would rather just see the
                dyn_cast.</div>
              <div> </div>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                +<br>
                +    /// Return the condition of the branch terminating
                the given basic block.<br>
                +    static Value *getBrCondtion(BasicBlock *);<br>
              </blockquote>
              <div><br>
              </div>
              <div style="">What if it doesn't terminate in a branch?
                I'm not really sure what abstraction you're trying to
                provide here.</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    Then return 0, the tiny code says that. <br>
    <br>
    <blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
      type="cite">
      <div style="font-family: arial, helvetica, sans-serif; font-size:
        10pt">
        <div dir="ltr">
          <div class="gmail_extra">
            <div class="gmail_quote">
              <div>
                <br>
              </div>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                +<br>
                +    /// Derive the precondition block (i.e the block
                that guards the loop<br>
                +    /// preheader) from the given preheader.<br>
                +    static BasicBlock *getPrecondBb(BasicBlock
                *PreHead);<br>
              </blockquote>
              <div><br>
              </div>
              <div style="">I don't understand this.. What is the
                invariant it exposes? What does it test first? Why is
                this not a method on LoopInfo if this is generally
                useful?</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    I'm afraid I don't understand what you are saying? <br>
    Did I say "invariant"? <br>
    <br>
    <blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
      type="cite">
      <div style="font-family: arial, helvetica, sans-serif; font-size:
        10pt">
        <div dir="ltr">
          <div class="gmail_extra">
            <div class="gmail_quote">
              <div> </div>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                +  };<br>
                +<br>
                +  /// This class is to recoginize idioms of
                population-count conducted in<br>
                +  /// a noncountable loop. Currently it only recognizes
                this pattern:<br>
                +  /// \code<br>
                +  ///   while(x) {cnt++; ...; x &= x - 1; ...}<br>
                +  /// \endcode<br>
                +  class NclPopcountRecognize {<br>
              </blockquote>
              <div><br>
              </div>
              <div style="">If NCL is an acronym you want to use,
                capitalize it as an acronym. I don't think it is a
                terribly useful acronym. I would call this just
                PopcountLoopRecognizer (note a noun, as this is a
                class), and document what sets of loops it handles
                currently. If there is some reason we wouldn't re-use
                this infrastructure for any future extensions, you
                should explain that here.</div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    Population count has tons of variants. <br>
    The loops could be countable loop or non-countable loop.  I don't
    think it is good idea to recognize all variants in one class. <br>
    Since the way we recognize a pattern in non-countable loop and
    countable loop are quite different. <br>
    We have some way to differentiate them in the name. <br>
    If "NCL" doesn't seem to be a good one, what is you suggession?<br>
    <br>
    <blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
      type="cite">
      <div style="font-family: arial, helvetica, sans-serif; font-size:
        10pt">
        <div dir="ltr">
          <div class="gmail_extra">
            <div class="gmail_quote">
              <div style=""><br>
              </div>
              <div style=""><br>
              </div>
              <div style="">Also, have you thought much about the design
                of these classes? Currently, we have LoopIdiomRecognize
                which is a pass, and which has innate logic for forming
                memset intrinsics out of loops. But we now also have a
                non-pass helper class to recognize a specific kind of
                loop. How would you envision this design scaling up to
                the next loop idiom?</div>
              <div style=""><br>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    Yes, I did. <br>
    It was a big class trying to recognize all interesting pattern. I
    don't want to make it become messy. <br>
    That is why I implement the popcount in a separate class. <br>
    In the future, I'd like new non-trivial recognization is also
    implemented in separate class as well. <br>
    That said, I don't like pull what we have in the pass to separate
    classes. <br>
    We have to make change one step at a time. <br>
    <br>
    ...<br>
  </body>
</html>