<div dir="ltr"><div class="gmail_extra">0003:</div><div class="gmail_extra"><br></div><div class="gmail_extra">+ // Mostly cloned from BitcodeWriter, but simplified a bit.<br></div><div class="gmail_extra"><br></div><div class="gmail_extra">
Please remove that comment, it isn't adding anything for the reader.</div><div class="gmail_extra"><br></div><div class="gmail_extra">And submit.</div><div class="gmail_extra"><br></div><div class="gmail_extra">0004:</div>
<div class="gmail_extra"><br></div><div class="gmail_extra">I *really* like the comment block!! This is going to save developers so much time!</div><div class="gmail_extra"><br></div><div class="gmail_extra">No changes, please submit!</div>
<div class="gmail_extra"><br></div><div class="gmail_extra">0005:</div><div class="gmail_extra"><br></div><div class="gmail_extra">Perfect. Please submit.</div><div class="gmail_extra"><br></div><div class="gmail_extra">
0006:</div>
<div class="gmail_extra"><br></div><div class="gmail_extra">+ // TODO: Already checked in cmpOps<br></div><div class="gmail_extra"><br></div><div class="gmail_extra">cmpOps?</div><div class="gmail_extra"><br></div>
<div class="gmail_extra">Nick</div><div class="gmail_extra"><div><br></div><div class="gmail_quote">On 15 May 2014 06:32, Stepan Dyatkovskiy <span dir="ltr"><<a href="mailto:stpworld@narod.ru" target="_blank">stpworld@narod.ru</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Hi Nick,<br>
What about rest of patches? I have fixed 0004 and 0005, others seems to be OK for you.. Can I commit them?<br>
<br>
Thanks!<br>
-Stepan<br>
<br>
Stepan Dyatkovskiy wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
Hi Nick,<br>
Thanks you for reviews. I have corrected for of comments from FAQ to<br>
what seems to be regular in LLVM.<br>
Patches has been committed as r208173 and r208189.<br>
<br>
Thanks!<br>
-Stepan<br>
<br>
Nick Lewycky wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
On 29 April 2014 04:36, Stepan Dyatkovskiy <<a href="mailto:stpworld@narod.ru" target="_blank">stpworld@narod.ru</a><br>
<mailto:<a href="mailto:stpworld@narod.ru" target="_blank">stpworld@narod.ru</a>>> wrote:<br>
<br>
Hi Nick.<br>
<br>
I have fixed and reattached all the patches from 0001 till 0011.<br>
<br>
0001, 0002:<br>
I have added detailed order descriptions to the methods declaration.<br>
I have also fixed everything as you asked.<br>
<br>
<br>
0001:<br>
<br>
+ for (uint64_t i = 0; i < NumElementsL; ++i) {<br>
vs.<br>
+ for (unsigned i = 0, e = NumElementsL; i != e; ++i) {<br>
<br>
This difference is necessary to avoid a warning? I don't actually care<br>
which form you use here given that NumElementsL is already an integer<br>
there's no recalculation we're avoiding like getting the end iterator<br>
each time, I just want it to look consistent. If it's working around a<br>
warning that's more important than consistent looks.<br>
<br>
0001 looks good. Please commit it.<br>
<br>
0002:<br>
<br>
One note, you use a writing style of asking a question, followed by<br>
answering it. For instance, "Will we still get the same numbering? As<br>
follows from FunctionComparator::compare(), we do CFG walk, ...". I<br>
don't think it's worth your time to rewrite it now, but we generally<br>
don't use that construction in llvm's comments, instead preferring to<br>
explain what the code does an including an argument that explains why<br>
the approach is correct. This amounts to the same thing, it's just a<br>
different style.<br>
<br>
+ /// their functions. If right value was meet first during<br>
scanning,<br>
<br>
"was meet" --> "was met"<br>
<br>
Please commit it.<br>
<br>
Nick<br>
<br>
<br>
0003, cmpStrings<br>
StringRef::compare performs lexicographical comparison, and it isn't<br>
the fastest one.<br>
In case, when strings has different length, it still compares their<br>
prefixes lexicographically.<br>
We need ordering among strings set, but we don't need to compare<br>
strings contents lexicographically all the time. So we can compare<br>
their sizes first, and do lexicographical comparison only when we<br>
really need it: only when strings are equal in size.<br>
<br>
I have added short version of these explanations into cmpStrings. So<br>
we could get really well documented methods based on our posts ;-)<br>
Those were good questions and definitely the answers should be in<br>
documentation somewhere.<br>
<br>
0004: I have added ordering description.<br>
0005: Fixed.<br>
0007: Fixed.<br>
0009:<br>
<br>
> Okay, I'm confused. We do all this work to produce a 3-way<br>
comparison,<br>
> then we wrap it up in an operator< for a std::set that will only<br>
look at<br>
> less-than...<br>
Actually, on top level we don't need 3-way comparison. But we do<br>
need it to determine whether functions are equal on some stage, just<br>
to proceed to the next "less significant" stage.<br>
Consider two functions (where BB1 and BB11 are equal)<br>
<br>
"F1: {BB1, BB2}" and<br>
"F2: {BB11, BB3}".<br>
<br>
First we have to realize that BB1 and BB11 are equal. And then<br>
determine which one of BB2 and BB3 is less than another.<br>
So finally we got 3-way comparison: less, equal and greater (in fact<br>
just neither less nor equal).<br>
<br>
<br>
> When we start detecting equal functions, do we merge them<br>
immediately? If so, can't that cause functions already in the set to<br>
have their equality comparisons change<br>
Yes we merge them immediately. And no, it won't cause situation you<br>
have mentioned. Once we changed function X, we lookup for functions<br>
that uses X, remove them from FnTree (former FnSet) and put them<br>
into Deferred list. Then we fill up worklist with Deferred contents<br>
and repeat comparison routine for these functions.<br>
<br>
<br>
Thanks for reviews!<br>
<br>
-Stepan<br>
<br>
Stepan Dyatkovskiy wrote:<br>
<br>
ping<br>
Stepan Dyatkovskiy wrote:<br>
<br>
Hi Nick,<br>
Thank you for reviews! Here is my answers for first two<br>
patches.<br>
<br>
Though as additional, I think its important answer to this<br>
question<br>
first:<br>
> Okay, I'm confused. We do all this work to produce a<br>
3-way comparison,<br>
> then we wrap it up in an operator< for a std::set that<br>
will only<br>
look at<br>
> less-than...<br>
Actually, on top level we don't need 3-way comparison. But<br>
we do need it<br>
to determine whether functions are equal on some stage just<br>
to proceed<br>
to the next "less meaningful" stage.<br>
Consider two functions<br>
"F1: {BB1, BB2}" and<br>
"F2: {BB-equal-to-BB1, BB3}".<br>
First we have to realize that BB1 and BB-equal-to-BB1 are<br>
equal. And<br>
then determine which one of BB2 and BB3 is less.<br>
<br>
First patch.<br>
<br>
> + /// 1. Bitcastability.<br>
> + /// Check whether L's type could be losslessly<br>
bitcasted to R's<br>
type.<br>
> + /// On this stage method, in case when lossless<br>
bitcast is not<br>
possible<br>
> + /// method returns -1 or 1, thus also defining which<br>
type is<br>
greater in<br>
> + /// context of bitcastability.<br>
I have added detailed order description for this method. You<br>
also can<br>
find this description in cmpConstantsComment.txt.<br>
<br>
> Optional: "Return its pointers comparison result" -><br>
"Return the<br>
result<br>
> of comparing the types"?<br>
Oops! Fixed.<br>
<br>
> + if (TyL->getTypeID() == Type::X86_MMXTyID)<br>
> + TyLWidth = 64;<br>
> + else if (const VectorType *VecTyL =<br>
dyn_cast<VectorType>(TyL))<br>
> + TyLWidth = VecTyL->getBitWidth();<br>
><br>
> Yikes! We never merge function A using a vector and<br>
function B<br>
using an<br>
> x86_mmx type together, right?<br>
Fixed!<br>
<br>
> "Nor of TyL and TyR" -> "Neither Tyl nor TyR".<br>
><br>
> Optional: "Return its pointers comparison result" -><br>
"Return the<br>
result<br>
> of comparing the types"?<br>
Fixed.<br>
<br>
> + case Value::UndefValueVal: return<br>
cmpType(L->getType(),<br>
R->getType());<br>
><br>
> You already have L->getType() and R->getType() in TyL<br>
and TyR.<br>
Fixed. Sorry.<br>
<br>
Second patch.<br>
<br>
> I'm curious whether we guarantee that<br>
> we get the same numbering when comparing two functions<br>
if I simply<br>
> reorder the BasicBlocks in one of the functions, will we<br>
still assign<br>
> the same numberings or not?<br>
...<br>
> Why not just start at the beginning and run to<br>
> the end, relying on the SSA dominance property.<br>
Good questions!<br>
<br>
For first one I have added these notes:<br>
[quote]<br>
As follows from FunctionComparator::compare(), we do CFG<br>
walk, we start<br>
from entry, and then take each terminator. So it doesn't<br>
matter how in<br>
fact BBs are ordered in function. And since cmpValues are<br>
called during<br>
this walk, the numbering depends only on how BBs located<br>
inside the CFG.<br>
[/quote]<br>
<br>
As for second question, I have added next notes:<br>
<br>
[quote]<br>
We also can't use dominance properties. Consider situation:<br>
If we compare two instruction operands: first is usage of<br>
local variable<br>
AL from function FL, and second is usage of local variable<br>
AR from FR.<br>
We are still not able to compare operands of PHI, since<br>
those could be<br>
operands from further BBs we didn't scan yet.<br>
[/quote]<br>
<br>
You can find comments also in sn_mapLRComment.txt<br>
I have also described lexicographical ordering for cmpValues<br>
in method<br>
comments. I have also put them in separated txt file:<br>
cmpValuesComment.txt.<br>
<br>
> + if (L == R) return 0;<br>
> + // Compare C1 and the bitcast result.<br>
> + if (int Res = cmpConstants(ConstL, ConstR))<br>
> + return Res;<br>
> [...]<br>
> + return 0;<br>
><br>
> Isn't that just:<br>
> return cmpConstants(ConstL, ConstR);<br>
OOOPS! Fixed.<br>
<br>
I'm working on rest of patches.<br>
Thanks!<br>
<br>
-Stepan.<br>
<br>
Nick Lewycky wrote:<br>
<br>
Stepan Dyatkovskiy wrote:<br>
<br>
Hello Nick!<br>
<br>
Can we proceed to work on patch series?<br>
<br>
<br>
Just in case, I have reattached them to this post.<br>
Would you prefer to see them in<br>
<a href="http://reviews.llvm.org" target="_blank">http://reviews.llvm.org</a> ?<br>
<br>
<br>
Oddly, I prefer attachments, but we can use<br>
<a href="http://reviews.llvm.org" target="_blank">reviews.llvm.org</a> <<a href="http://reviews.llvm.org" target="_blank">http://reviews.llvm.org</a>> if you<br>
prefer?<br>
<br>
0001-constants-comparison.__<u></u>patch<br>
<br>
+ /// 1. Bitcastability.<br>
+ /// Check whether L's type could be losslessly<br>
bitcasted to R's type.<br>
+ /// On this stage method, in case when lossless<br>
bitcast is not<br>
possible<br>
+ /// method returns -1 or 1, thus also defining which<br>
type is<br>
greater in<br>
+ /// context of bitcastability.<br>
<br>
Looking at the code, it looks like the order is:<br>
1. first class types, by typeid<br>
2. vectors (by element size? by number of elements?<br>
the two<br>
multiplied together?)<br>
<br>
+ // Nor of TyL and TyR are values of first class<br>
type. Return<br>
+ // its pointers comparison result.<br>
<br>
"Nor of TyL and TyR" -> "Neither Tyl nor TyR".<br>
<br>
Optional: "Return its pointers comparison result" -><br>
"Return the result<br>
of comparing the types"?<br>
<br>
+ if (TyL->getTypeID() == Type::X86_MMXTyID)<br>
+ TyLWidth = 64;<br>
+ else if (const VectorType *VecTyL =<br>
dyn_cast<VectorType>(TyL))<br>
+ TyLWidth = VecTyL->getBitWidth();<br>
<br>
Yikes! We never merge function A using a vector and<br>
function B using an<br>
x86_mmx type together, right? Note what the LangRef says<br>
about x86_mmx:<br>
"The operations allowed on it are quite limited:<br>
parameters and return<br>
values, load and store, and bitcast."<br>
<br>
+ case Value::UndefValueVal: return<br>
cmpType(L->getType(),<br>
R->getType());<br>
<br>
You already have L->getType() and R->getType() in TyL<br>
and TyR. You also<br>
already have cmpType(TyL, TyR) in TypesRes. Please just<br>
reuse them in<br>
the rest of this function.<br>
<br>
<br>
0002-values-comparison-__<u></u>replacement-for-enumerate.__<u></u>patch<br>
<br>
+ // Assign serial numbers to values from left<br>
function, and values<br>
from<br>
+ // right function.<br>
+ // Explanation:<br>
+ // Being comparing functions we need to compare<br>
values we meet at<br>
left and<br>
+ // right sides.<br>
+ // Its easy to sort things out for external values.<br>
It just should be<br>
+ // the same value at left and right.<br>
+ // But for local values (those were introduced inside<br>
function body)<br>
+ // we have to ensure they were introduced at exactly<br>
the same place,<br>
+ // and plays the same role. How to do it?<br>
+ // Just assign serial number to each value when we<br>
meet it first<br>
time.<br>
+ // Values that were met at same place will be with<br>
same serial<br>
numbers.<br>
<br>
This explanation is correct, but I'm curious whether we<br>
guarantee that<br>
we get the same numbering when comparing two functions<br>
if I simply<br>
reorder the BasicBlocks in one of the functions, will we<br>
still assign<br>
the same numberings or not?<br>
<br>
Also, it's not clear from the comment why we need the<br>
complexity of<br>
numbering them anyways. Why not just start at the<br>
beginning and run to<br>
the end, relying on the SSA dominance property to<br>
guarantee that we<br>
visit definitions before uses, and assume we'll visit<br>
each value in the<br>
same order (look at each instruction and operand in<br>
order and just<br>
complain about differences)? The answer is that PHI<br>
nodes mean we will<br>
visit some uses before some definitions.<br>
<br>
Anyways, changing this text is optional in my mind. I<br>
think it could be<br>
better, but it's not worth spending much time on it.<br>
<br>
+int FunctionComparator::cmpValues(<u></u>__const Value *L,<br>
const Value *R) {<br>
<br>
I wouldn't mind an explanation of the lexographic<br>
ordering on this one<br>
either. Constants before non-constants, then InlineAsm<br>
(ordered by<br>
address), then visit order.<br>
<br>
+ if (L == R) return 0;<br>
+ // Compare C1 and the bitcast result.<br>
+ if (int Res = cmpConstants(ConstL, ConstR))<br>
+ return Res;<br>
[...]<br>
+ return 0;<br>
<br>
Isn't that just:<br>
return cmpConstants(ConstL, ConstR);<br>
?<br>
<br>
0003-attributes-comparison.__<u></u>patch<br>
<br>
+int FunctionComparator::__<u></u>cmpStrings(StringRef L,<br>
StringRef R) const {<br>
+ // Prevent heavy comparison, compare sizes first.<br>
+ if (int Res = cmpNumbers(L.size(), R.size()))<br>
+ return Res;<br>
+<br>
+ return L.compare(R);<br>
+}<br>
<br>
Optional: I'm curious whether that's actually worth it?<br>
I checked<br>
StringRef and it does the length comparison last. It<br>
could be that you<br>
know that your strings tend to be longer than StringRef<br>
which tries to<br>
be efficient in general.<br>
<br>
0004-operations-comparison.__<u></u>patch<br>
<br>
+int FunctionComparator::__<u></u>cmpOperation(const<br>
Instruction *L,<br>
<br>
Optional: Again, a description of the ordering would be<br>
interesting.<br>
Opcode number, then number of operands, then type, then<br>
'nsw/nuw/exact/volatile'.<br>
<br>
0005-GEP-comparison.patch<br>
<br>
+int FunctionComparator::cmpGEP(__<u></u>const GEPOperator<br>
*GEPL,<br>
+ const GEPOperator<br>
*GEPR) {<br>
<br>
+ unsigned int ASL = GEPL->getPointerAddressSpace()<u></u>__;<br>
<br>
Extra line after {<br>
<br>
if (DL) {<br>
[...]<br>
+ unsigned BitWidth = DL ?<br>
DL->getPointerSizeInBits(ASL) : 1;<br>
<br>
BitWidth doesn't need to test DL for null-ness.<br>
<br>
0006-top-level-compare-method.<u></u>__patch<br>
<br>
Perfect!<br>
<br>
0007-sanity-check.patch<br>
<br>
+ // Check symmetricity.<br>
<br>
"symmetricity" -> "symmetry".<br>
<br>
0008-removed-unused-methods.__<u></u>patch<br>
<br>
Perfect<br>
<br>
0009-FnSet-replaced-with-__<u></u>FnTree.patch<br>
<br>
Okay, I'm confused. We do all this work to produce a<br>
3-way comparison,<br>
then we wrap it up in an operator< for a std::set that<br>
will only look at<br>
less-than, but never uses the equals result. We want to<br>
do one<br>
comparison between two functions, not two (ie., std::set<br>
does<br>
func1<func2 and func2<func1 to see whether they're equal<br>
or less than,<br>
and we don't need that since cmp(func1, func2) tells us<br>
directly). Maybe<br>
this comes in a future patch?<br>
<br>
One other question, I should probably know the answer to<br>
this but it's<br>
been a while. :) When we start detecting equal<br>
functions, do we merge<br>
them immediately? If so, can't that cause functions<br>
already in the set<br>
to have their equality comparisons change (ie., A calls<br>
X, B calls Y, A<br>
and B are in the set, then we merge X and Y for being<br>
equal, now we've<br>
changed A and B to both called the same merged function,<br>
so they start<br>
to compare the same). Sets tend not to like having their<br>
comparison<br>
functions change on elements that they contain.<br>
<br>
0010-updated-comments.patch<br>
<br>
Okay.<br>
<br>
0011-removed-DenseMap-helpers.<u></u>__patch<br>
<br>
Yay deleting dead code!<br>
<br>
Nick<br>
<br>
<br>
Thanks!<br>
-Stepan.<br>
<br>
Stepan Dyatkovskiy wrote:<br>
<br>
ping<br>
Stepan Dyatkovskiy wrote:<br>
<br>
Hi Nick,<br>
<br>
Please find the fixed patches in attachment.<br>
Series starts from "constants comparison".<br>
<br>
Below small report of what has been fixed,<br>
and answers on your<br>
questions.<br>
<br>
cmpTypes:<br>
> Please add a comment for this method.<br>
Include the meaning of the<br>
returned value. ("man strcmp" for<br>
inspiration.)<br>
Fixed and committed. So you can look in<br>
trunk, may be I forgot to do<br>
something (hope not :-) )<br>
<br>
checkForLosslessBitcast, cmpConstants:<br>
> Why isn't this using cmpType?<br>
Fixed.<br>
<br>
> Please put the else on the same line as<br>
the closing brace.<br>
Fixed.<br>
<br>
>> + else if (const VectorType *thisPTy =<br>
dyn_cast<VectorType>(L))<br>
> Missing initial capital<br>
Sorry. Fixed. Actually these typos has been<br>
cloned from<br>
Type::canLosslesslyBitCastTo.<br>
<br>
>> + return cmpNumbers((uint64_t)L,<br>
(uint64_t)R);<br>
>><br>
> Please assert on unknown constant.<br>
That's possible. But what if we really got<br>
unknown constant? New<br>
comming<br>
constant types merely depends on<br>
MergeFunctions implementation. So we<br>
get crash if it will happen. And we loose<br>
nothing comparing them<br>
just as<br>
pointers. So, do we really need an<br>
assertion? Currently I kept it<br>
as it<br>
was. If your answer is "yes, we need it", it<br>
would easy to add it:<br>
<br>
case Value::FunctionVal:<br>
case Value::GlobalVariableVal:<br>
- case Value::GlobalAliasVal:<br>
- default: // Unknown constant<br>
- return cmpNumbers((uint64_t)L,<br>
(uint64_t)R);<br>
+ case Value::GlobalAliasVal:<br>
+ return cmpNumbers((uint64_t)L,<br>
(uint64_t)R);<br>
+ default:<br>
+ llvm_unreachable("Unknown constant.");<br>
<br>
About variable names: Var1, Var2 vs<br>
VarL,VarR.<br>
I think its better to use L/R concept. Easer<br>
to understand what to<br>
return (-1, or 1) when you see L/R rather<br>
than Var1/Var2.<br>
Var1/Var2 has been kept for functions that<br>
are to be removed till the<br>
end of patch series.<br>
F1 and F2 names were also kept till the "top<br>
level compare method"<br>
patch.<br>
<br>
> Alternatively, any reason not to have<br>
cmpPointers(const void *L,<br>
const void *R)?<br>
I could do it. I just wanted to show that<br>
pointers are compared<br>
like a<br>
numbers in some situations. Actually we do<br>
it in cases when we have<br>
absolutely no idea of smarter comparison.<br>
And back to cmpPonters. Its rather about<br>
what intuition tells. If<br>
pointers are equal as numbers, could they be<br>
different somehow? Could<br>
they be non-castable to numbers on some<br>
platforms? The last one is<br>
big<br>
trouble for DenseMapInfo implementation..<br>
If there is any (even very small)<br>
possibility of such cases - then<br>
yes,<br>
I vote for cmpPointers. The last word is up<br>
to you anyway.<br>
<br>
Attributes (0005):<br>
> Attributes already have operator< and<br>
operator==. Please reuse<br>
them.<br>
Fixed. I used simple "if":<br>
<br>
if (LA < RA)<br>
return -1;<br>
if (RA < LA)<br>
return 1;<br>
<br>
Though its possible to use:<br>
<br>
if (int Res = (LA < RA) ? -1 : (RA < LA) ? 1<br>
: 0)<br>
return Res;<br>
<br>
Which one is more preferable?<br>
<br>
cmpGEP (0007):<br>
> + int cmpGEP(const GEPOperator *GEP1,<br>
const GEPOperator *GEP2);<br>
> + int cmpGEP(const GetElementPtrInst<br>
*GEP1,<br>
> + const GetElementPtrInst *GEP2) {<br>
><br>
> Const members?<br>
We can't make it constant, since it compares<br>
values (cmpValues, aka<br>
enumerate). So, if we see Value first time,<br>
we have to add it into<br>
sn_mapL/R.<br>
<br>
> I think you should put this under if (DL)<br>
{ /* declare Offset1,<br>
Offset2, etc. */ }<br>
Fixed.<br>
<br>
About 0008:<br>
> Did you mean "cmpOperation"?<br>
Yes. We could keep it in one place. I have<br>
fixed this case and<br>
removed<br>
TODO.<br>
<br>
> "compare" still returns "bool". I'm going<br>
to assume this was<br>
meant to<br>
go in 0009.<br>
Yes.<br>
<br>
About 0011:<br>
> std::set is frowned upon, see<br>
<br>
<a href="http://llvm.org/docs/__ProgrammersManual.html#set" target="_blank">http://llvm.org/docs/__<u></u>ProgrammersManual.html#set</a><br>
<br>
<<a href="http://llvm.org/docs/ProgrammersManual.html#set" target="_blank">http://llvm.org/docs/<u></u>ProgrammersManual.html#set</a>><br>
. Do you actually<br>
rely<br>
on the stable iterator guarantee? Or would<br>
another set-like container<br>
be a<br>
better fit?<br>
Huh.. I have looked for alternatives.<br>
Unfortunately SmallSet is not<br>
suitable, since we need to lookup existing<br>
items, and SmallSet::find<br>
method is not present. May be I have missed<br>
something.<br>
Actually we need binary tree implementation.<br>
Do we have better<br>
analogue?<br>
I could implement new one. Think it should<br>
be just one more patch<br>
afterwards.<br>
<br>
>No. An identifier with underscore then<br>
capital letter is reserved<br>
for<br>
the implementation. You can just call them<br>
"F" and "DL", then the<br>
ctor<br>
initializer list can be "F(F), DL(DL)" and<br>
that will work fine.<br>
Fixed.<br>
<br>
0013:<br>
> Sorry, I'm not sure what this sentence<br>
means? The thing your<br>
example<br>
is talking about is showing is that two<br>
functions in an SCC may be<br>
equivalent, and detecting them requires<br>
"blinding" (ignoring) certain<br>
details of the function when you do the<br>
comparison. We blind<br>
ourselves<br>
to the pointee types, but not to callees of<br>
functions in the same<br>
SCC.<br>
<br>
I meant generalising of cross-reference case:<br>
in old implementation, while comparing F and<br>
G functions, we treat as<br>
equal case when F uses G, with case when G<br>
uses F (at the same<br>
place).<br>
It could happen that F uses G1, while G1<br>
uses G2, while G2 uses F.<br>
So it<br>
still the same. But currently we have to<br>
invent way how to detect<br>
such<br>
cases.<br>
<br>
Thanks for reviews!<br>
-Stepan<br>
<br>
Stepan Dyatkovskiy wrote:<br>
<br>
Hi Nick,<br>
I have committed 0001 as r203788.<br>
I'm working on fixes for 0002 - 0014.<br>
<br>
> After reading through this patch<br>
series, I feel like I'm missing<br>
> something important. Where's the sort<br>
function? It looks like<br>
we're<br>
> still comparing all functions to all<br>
other functions.<br>
When you insert functions into std::set<br>
or its analo g s it does all<br>
the<br>
job for you. Since internally it builds<br>
a binary tree using "less"<br>
comparison, and each insert/look-up<br>
operation takes O(log(N)) time.<br>
<br>
-Stepan.<br>
<br>
Nick Lewycky wrote:<br>
<br>
On 27 February 2014 08:23, Stepan<br>
Dyatkovskiy <<a href="mailto:stpworld@narod.ru" target="_blank">stpworld@narod.ru</a><br>
<mailto:<a href="mailto:stpworld@narod.ru" target="_blank">stpworld@narod.ru</a>><br>
<mailto:<a href="mailto:stpworld@narod.ru" target="_blank">stpworld@narod.ru</a><br>
<mailto:<a href="mailto:stpworld@narod.ru" target="_blank">stpworld@narod.ru</a>>>> wrote:<br>
<br>
Hi Nick,<br>
<br>
I tried to rework changes as you<br>
requested. One of patches (0004<br>
with extra assertions) has been<br>
removed.<br>
<br>
<br>
> + bool isEquivalentType(Type<br>
*Ty1, Type *Ty2) const {<br>
> + return cmpType(Ty1, Ty2) == 0;<br>
> + }<br>
><br>
> Why do we still need<br>
isEquivalentType? Can we nuke this?<br>
Yup. After applying all the patches<br>
isEquivalentType will be<br>
totally<br>
replaced with cmpType. All isEqXXXX<br>
and friends will be removed in<br>
0011 (old 0012). No traces left.<br>
Old function wasn't removed in 0001<br>
just for keeping patches<br>
without<br>
extra noise like:<br>
<br>
- something that uses<br>
isEquivalentType<br>
+ something that uses cmpType<br>
<br>
The point is, that "something" that<br>
uses isEquivalentType, will be<br>
also replaced with one of next<br>
patches in this series.<br>
<br>
<br>
><br>
> +static int cmpNumbers(uint64_t<br>
L, uint64_t R) {<br>
> + if (L < R) return -1;<br>
> + if (L > R) return 1;<br>
> + return 0;<br>
> +}<br>
><br>
> At a high level, you don't<br>
actually need a <=> operator to use a<br>
sort. A<br>
> strict ordering ( < operator) is<br>
sufficient. (Note that for<br>
mergefunc, a<br>
> strict weak ordering is not<br>
sufficient, it must be a total<br>
ordering.)<br>
That could be done with int<br>
FunctionComparator::compare(). We can<br>
replace it with bool<br>
FunctionComparator::less(). Though<br>
for all<br>
other cmp methods need at least 2<br>
bits of information as result:<br>
1. Whether things are equal.<br>
2. Whether left less than right.<br>
<br>
As for<br>
FunctionComparator::compare(),<br>
conversion into less() will<br>
increase time of sanity check (patch<br>
#0010).<br>
Sanity check is just a sweet bonus.<br>
It checks that ordering<br>
implemented properly (checks order<br>
relation properties).<br>
Turning compare() into less() mean,<br>
that we'll have to run<br>
comparison two times: L.less(R) and<br>
R.less(L). But may be sanity<br>
check is not a thing to be published<br>
at all.<br>
<br>
<br>
><br>
> Consider hoisting this inside the<br>
FunctionComparator class? That<br>
class<br>
> should have a bunch of<br>
implementations of comparisons<br>
between<br>
various<br>
> different things, which can pass<br>
down to other methods in the<br>
same class.<br>
In new patch series attached to this<br>
post, I have moved all static<br>
methods into FunctionComparator.<br>
<br>
<br>
> + // Replacement for<br>
type::canLosslesslyBitCastTo, that<br>
> + // establish order relation on<br>
this kind of properties.<br>
> + int<br>
checkForLosslessBitcast(const Type<br>
*L, const Type *R);<br>
><br>
> Type:: not type:: . Please make<br>
this comment more descriptive.<br>
Done.<br>
[new comment]<br>
Replacement for<br>
Type::canLosslesslyBitCastTo, that<br>
<br>
establish order relation on this<br>
kind of properties<br>
Returns 0, if L and R types could be<br>
converted to each other<br>
without<br>
reinterpretation of bits.<br>
Otherwise method returns -1 or 1,<br>
defining total ordering between<br>
types in context of lossless<br>
bitcastability trait.<br>
E.g.: if L is less than R (result is<br>
-1), than every type that<br>
could be<br>
losslessly bitcasted to L is less<br>
than R.<br>
[/new comment]<br>
<br>
<br>
><br>
> + /// Replacement for C1 ==<br>
getBitCast(C2, C1Ty)<br>
> + /// Its more controllable, and<br>
supposed to be simpler and<br>
more<br>
> predictionable.<br>
> + /// As very important<br>
advantage: it allows to introduce<br>
order<br>
relation on<br>
> + /// constants set, and thus use<br>
it as trait in refinement<br>
routines.<br>
><br>
> "Its" --> "It's".<br>
"predictionable" --> "predictable".<br>
And how is<br>
it more<br>
> predictable? I think this comment<br>
would be better if it<br>
described the<br>
> function instead of making<br>
comparisons between it and other<br>
functions.<br>
> Something like, "Compare<br>
constants under a system where<br>
pointer<br>
to X and<br>
> pointer to Y are considered<br>
equal" or whatever is actually true<br>
here.<br>
Done.<br>
[new comment]<br>
<br>
Replacement for C1 == getBitCast(C2,<br>
C1Ty)<br>
Parses constants contents, assuming<br>
that types are losslessly<br>
bitcasted between each other. So<br>
actually it ignores types and only<br>
compares bits from L and R.<br>
Returns 0, if L and R has equivalent<br>
content.<br>
-1 or 1 if values are different.<br>
Maintaining total ordering<br>
requires<br>
two values that indicates<br>
non-equivalence (L less R, L<br>
greater R).<br>
[/new comment]<br>
<br>
<br>
><br>
> +enum ConstantType {<br>
> I'm not sure that this buys you<br>
much. All the "isa" tests can be<br>
broken<br>
> down into a switch on<br>
getValueID() with the one<br>
exception of<br>
isNullValue().<br>
Done.<br>
<br>
<br>
> + assert(<br>
> +<br>
<br>
C1->getType()->____<u></u>canLosslesslyBitCastTo(C2->___<u></u>_getType())<br>
&&<br>
> + "Pass is healthless.<br>
checkForLosslessBitcast should be<br>
twin of "<br>
> + "canLosslesslyBitCastTo method,<br>
except the case when the<br>
last one "<br>
> + "returns false, the first one<br>
should return -1 or 1");<br>
...<br>
<br>
> I think we can skip the asserts<br>
here. They aren't detecting a<br>
specific<br>
> bug, they're checking whether the<br>
new code does a certain task<br>
relative<br>
> to the old code. Drop the old<br>
code, your new code is the new<br>
sheriff in<br>
> town.<br>
Asserts has been removed.<br>
<br>
<br>
><br>
> + DenseMap<const Value*, int><br>
sn_map1, sn_map2;<br>
><br>
> What is sn short for ("seen")?<br>
Why are there two of these?<br>
Serial number :-)<br>
<br>
><br>
> + std::pair<DenseMap<const Value<br>
*, int>::iterator, bool><br>
> + LeftSN =<br>
sn_map1.insert(std::make_pair(<u></u>____V1,<br>
sn_map1.size())),<br>
> + RightSN =<br>
sn_map2.insert(std::make_pair(<u></u>____V2,<br>
sn_map2.size()));<br>
><br>
> So I think I get it, this is easy<br>
to reason about. You number<br>
each value<br>
> going down both functions. But I<br>
think you can eliminate one of<br>
these<br>
> maps because you know that if the<br>
left and right were ever<br>
different<br>
> then we terminate the comparison<br>
immediately. You can at least<br>
share one<br>
> map with both V1 and V2, but I<br>
think you can reduce the number<br>
of map<br>
> insertions.<br>
Not sure. Consider that in left you<br>
have:<br>
%A = alloca i32<br>
%B = alloca i32<br>
store i32 1, i32* %A<br>
<br>
And in right:<br>
%A = alloca i32<br>
%B = alloca i32<br>
store i32 1, i32* %B<br>
<br>
When we meet 'store' instruction we<br>
need to check in which order %A<br>
was allocated at left and in which<br>
order %B was allocated at right.<br>
So for each value in function we<br>
have to assign serial number. Then<br>
comparing to local values from<br>
different functions we can just<br>
check<br>
their serial numbers.<br>
Though, may be I missed something?<br>
May be there is another way to<br>
determine "who was first".<br>
<br>
<br>
> High-level question, are<br>
attributes ordered? Your<br>
comparison is<br>
ordered.<br>
> So if I have function F with<br>
string attributes "sse2",<br>
"gc=hemisphere"<br>
> and another function G with<br>
string attributes "gc=hemisphere",<br>
"sse2"<br>
> then they would be considered</blockquote></blockquote>
</blockquote></div><br></div></div>