<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
HI, Chandler:<br>
<br>
 The bug Galina reported is false positive -- someone remove
--triple=xxx in the testing case, which disable the the
optimization. <br>
Now the testing case is moved to the arch-specific folder. It should
be ok now. <br>
<br>
  >segfault on certain loops.<br>
 Why not show me the loop? <br>
<br>
 > there are pretty pervasive problems with this patch.<br>
 In you long mail, I don't see any real problems other than some
typos and few code coding std issue. <br>
 Some "style" is the combination of different reviewer's tastes.Â
Some reviewers' taste are just opposite to you personal opinions. <br>
<br>
 Also see the following interleaving comment. <br>
<br>
<blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
type="cite">
<div style="font-family: arial, helvetica, sans-serif; font-size:
10pt">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">Â
public:<br>
+ Â /// PopcntHwSupport - Hardware support for population
count. Compared to the<br>
+ Â /// SW implementation, HW support is supposed to
significantly boost the<br>
+ Â /// performance when the population is dense, and it
may or not may degrade<br>
</blockquote>
<div><br>
</div>
<div style="">"it may or may not" rather than "it may or
not may".</div>
</div>
</div>
</div>
</div>
</blockquote>
This is *REAL* problem. a typo. <br>
<blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
type="cite">
<div style="font-family: arial, helvetica, sans-serif; font-size:
10pt">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>Â </div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
+ Â /// performance if the population is sparse. A HW
support is considered as<br>
+ Â /// "Fast" if it can outperform, or is on a par with,
SW implementaion when<br>
+ Â /// the population is sparse; otherwise, it is
considered as "Slow".<br>
</blockquote>
<div><br>
</div>
<div style="">I wonder, why not express this in the
interface? You could query for fast/slow with an
expected density?</div>
</div>
</div>
</div>
</div>
</blockquote>
Why not?<br>
for the loop like this, slow HW support may outperform the SW
version:<br>
<br>
 for (i = 0; I < 32; I++)<br>
  cnt += (val >> 1) & 1;<br>
<br>
<br>
<br>
<blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
type="cite">
<div style="font-family: arial, helvetica, sans-serif; font-size:
10pt">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div style=""><br>
</div>
+ Â enum PopcntHwSupport {<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
+ Â Â None,<br>
+ Â Â Fast,<br>
+ Â Â Slow<br>
+ Â };<br>
</blockquote>
<div><br>
</div>
<div style="">Please follow the coding standards for
naming enum types and enumerators.</div>
</div>
</div>
</div>
</div>
</blockquote>
Why it doesn't conform the std?<br>
<br>
<blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
type="cite">
<div style="font-family: arial, helvetica, sans-serif; font-size:
10pt">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>Â </div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
  virtual ~ScalarTargetTransformInfo() {}<br>
<br>
  /// isLegalAddImmediate - Return true if the
specified immediate is legal<br>
@@ -122,6 +134,11 @@<br>
  virtual bool shouldBuildLookupTables() const {<br>
   return true;<br>
  }<br>
+<br>
+ Â /// getPopcntHwSupport - Return hardware support for
population count.<br>
</blockquote>
<div><br>
</div>
<div style="">Please follow the doxygen examples in the
coding standards and don't duplicate function names in
comments.</div>
</div>
</div>
</div>
</div>
</blockquote>
Ok, I miss this one. <br>
<br>
<blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
type="cite">
<div style="font-family: arial, helvetica, sans-serif; font-size:
10pt">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>Â </div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
+ Â virtual PopcntHwSupport getPopcntHwSupport(unsigned
IntTyWidthInBit) const {<br>
</blockquote>
<div><br>
</div>
<div style="">Why accept the integer type width? does it
matter other than it being a legal integer width for the
target?</div>
</div>
</div>
</div>
</div>
</blockquote>
The cost are different for int-width. <br>
<br>
<blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
type="cite">
<div style="font-family: arial, helvetica, sans-serif; font-size:
10pt">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>Â </div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
+ Â Â return None;<br>
+ Â }<br>
 };<br>
<br>
 /// VectorTargetTransformInfo - This interface is used
by the vectorizers<br>
<br>
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp<br>
URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=168931&r1=168930&r2=168931&view=diff"
target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=168931&r1=168930&r2=168931&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
(original)<br>
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Thu
Nov 29 13:38:54 2012<br>
@@ -17670,6 +17670,17 @@<br>
  return -1;<br>
 }<br>
<br>
+ScalarTargetTransformInfo::PopcntHwSupport<br>
+X86ScalarTargetTransformImpl::getPopcntHwSupport(unsigned
TyWidth) const {<br>
+ Â assert(isPowerOf2_32(TyWidth) && "Ty width
must be power of 2");</blockquote>
<div><br>
</div>
<div style="">This constraint seems strange. It certainly
doesn't seem like it should be an assert.</div>
</div>
</div>
</div>
</div>
</blockquote>
It catch the case when this function is fed with nonsense parameter,
and it at least make sense for x86<br>
 <br>
<blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
type="cite">
<div style="font-family: arial, helvetica, sans-serif; font-size:
10pt">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div style="">
<br>
</div>
<div style="">If i have an i24, i cant count the
population of it by counting the population of it
zext'ed to 32 bits...</div>
</div>
</div>
</div>
</div>
</blockquote>
You have to pass 32 when calling this function. <br>
<blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
type="cite">
<div style="font-family: arial, helvetica, sans-serif; font-size:
10pt">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>Â </div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
 </blockquote>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
+ Â const X86Subtarget &ST =
TLI->getTargetMachine().getSubtarget<X86Subtarget>();<br>
+<br>
+ Â // TODO: Currently the __builtin_popcount()
implementation using SSE3<br>
+ Â // Â instructions is inefficient. Once the problem is
fixed, we should<br>
+ Â // Â call ST.hasSSE3() instead of ST.hasSSE4().<br>
</blockquote>
<div><br>
</div>
<div style="">__builtin_popcount isn't the issue here,
this is in LLVM. What is the actual issue? Why isn't
"Slow' used here?</div>
</div>
</div>
</div>
</div>
</blockquote>
Please check my original mail. It is a real LLVM defect in generate
optimal instruction sequence using SSE3. <br>
<br>
<blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
type="cite">
<div style="font-family: arial, helvetica, sans-serif; font-size:
10pt">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div style=""><br>
</div>
<div style="">Also LLVM uses FIXME in comments more than
TODO.</div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
It is in my TODO list. <br>
<blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
type="cite">
<div style="font-family: arial, helvetica, sans-serif; font-size:
10pt">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>Â <br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">Â
namespace {<br>
+<br>
+ Â class LoopIdiomRecognize;<br>
+<br>
+ Â /// This class defines some utility functions for
loop idiom recognization.<br>
</blockquote>
<div><br>
</div>
<div style="">Please don't use a class to organize a
collection of static methods. That is the role of a
namespace.</div>
</div>
</div>
</div>
</div>
</blockquote>
Some reviewers hate this idea. It is not hard to imagine, this
class will become more and more complicate.<br>
At that point, it will not be a container for static functions any
more. <br>
<br>
<blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
type="cite">
<div style="font-family: arial, helvetica, sans-serif; font-size:
10pt">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div style=""><br>
</div>
<div style="">And this doesn't need a namespace at all --
this is in a single .cc file. Please just use static
functions, that is is the overwhelmingly predominant
pattern in LLVM.</div>
<div>Â </div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
+ Â class LIRUtil {<br>
+ Â public:<br>
+ Â Â /// Return true iff the block contains nothing but
an uncondition branch<br>
+ Â Â /// (aka goto instruction).<br>
+ Â Â static bool isAlmostEmpty(BasicBlock *);<br>
</blockquote>
<div><br>
</div>
<div style="">Please use names for parameters in LLVM</div>
</div>
</div>
</div>
</div>
</blockquote>
<blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
type="cite">
<div style="font-family: arial, helvetica, sans-serif; font-size:
10pt">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div style=""><br>
</div>
<div style="">Also, I don't think this needs a helper
method. First, you *always* have a terminator
instruction, the verify assures this. Typically, you
would then test that &BB->begin() ==
BB->getTerminatorInst().</div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
<blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
type="cite">
<div style="font-family: arial, helvetica, sans-serif; font-size:
10pt">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div style=""><br>
</div>
<div style="">Or is the point of this function to test for
an empty BB *and* an unconditional branch? If so, why do
you need to handle that -- SimplifyCFG should have
removed such blocks already?</div>
</div>
</div>
</div>
</div>
</blockquote>
I just to make sure this code work for any input. <br>
I don't want to assume the input is CFG optimized. <br>
<br>
<blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
type="cite">
<div style="font-family: arial, helvetica, sans-serif; font-size:
10pt">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>
 </div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+ Â Â static BranchInst *getBranch(BasicBlock *BB) {<br>
+ Â Â Â return
dyn_cast<BranchInst>(BB->getTerminator());<br>
+ Â Â }<br>
</blockquote>
<div><br>
</div>
<div style="">Abstracting this into a method doesn't seem
to help readability at all. I would rather just see the
dyn_cast.</div>
<div>Â </div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+ Â Â /// Return the condition of the branch terminating
the given basic block.<br>
+ Â Â static Value *getBrCondtion(BasicBlock *);<br>
</blockquote>
<div><br>
</div>
<div style="">What if it doesn't terminate in a branch?
I'm not really sure what abstraction you're trying to
provide here.</div>
</div>
</div>
</div>
</div>
</blockquote>
Then return 0, the tiny code says that. <br>
<br>
<blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
type="cite">
<div style="font-family: arial, helvetica, sans-serif; font-size:
10pt">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>
<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+ Â Â /// Derive the precondition block (i.e the block
that guards the loop<br>
+ Â Â /// preheader) from the given preheader.<br>
+ Â Â static BasicBlock *getPrecondBb(BasicBlock
*PreHead);<br>
</blockquote>
<div><br>
</div>
<div style="">I don't understand this.. What is the
invariant it exposes? What does it test first? Why is
this not a method on LoopInfo if this is generally
useful?</div>
</div>
</div>
</div>
</div>
</blockquote>
I'm afraid I don't understand what you are saying? <br>
Did I say "invariant"? <br>
<br>
<blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
type="cite">
<div style="font-family: arial, helvetica, sans-serif; font-size:
10pt">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>Â </div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
+ Â };<br>
+<br>
+ Â /// This class is to recoginize idioms of
population-count conducted in<br>
+ Â /// a noncountable loop. Currently it only recognizes
this pattern:<br>
+ Â /// \code<br>
+ Â /// Â while(x) {cnt++; ...; x &= x - 1; ...}<br>
+ Â /// \endcode<br>
+ Â class NclPopcountRecognize {<br>
</blockquote>
<div><br>
</div>
<div style="">If NCL is an acronym you want to use,
capitalize it as an acronym. I don't think it is a
terribly useful acronym. I would call this just
PopcountLoopRecognizer (note a noun, as this is a
class), and document what sets of loops it handles
currently. If there is some reason we wouldn't re-use
this infrastructure for any future extensions, you
should explain that here.</div>
</div>
</div>
</div>
</div>
</blockquote>
Population count has tons of variants. <br>
The loops could be countable loop or non-countable loop. I don't
think it is good idea to recognize all variants in one class. <br>
Since the way we recognize a pattern in non-countable loop and
countable loop are quite different. <br>
We have some way to differentiate them in the name. <br>
If "NCL" doesn't seem to be a good one, what is you suggession?<br>
<br>
<blockquote
cite="mid:CAGCO0KhsyGpJjT6zNTfVGOa2NXtGrVpYJO4CuAW6bSdWZQn2ww@mail.gmail.com"
type="cite">
<div style="font-family: arial, helvetica, sans-serif; font-size:
10pt">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div style=""><br>
</div>
<div style=""><br>
</div>
<div style="">Also, have you thought much about the design
of these classes? Currently, we have LoopIdiomRecognize
which is a pass, and which has innate logic for forming
memset intrinsics out of loops. But we now also have a
non-pass helper class to recognize a specific kind of
loop. How would you envision this design scaling up to
the next loop idiom?</div>
<div style=""><br>
</div>
</div>
</div>
</div>
</div>
</blockquote>
Yes, I did. <br>
It was a big class trying to recognize all interesting pattern. I
don't want to make it become messy. <br>
That is why I implement the popcount in a separate class. <br>
In the future, I'd like new non-trivial recognization is also
implemented in separate class as well. <br>
That said, I don't like pull what we have in the pass to separate
classes. <br>
We have to make change one step at a time. <br>
<br>
...<br>
</body>
</html>