<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
On 07/07/2016 07:10 PM, Daniel Berlin wrote:<br>
<blockquote
cite="mid:CAF4BwTU+bfVC+kKvjjBHOLraQ_YEwzk0-R7S7BCBqzqLR5mH-w@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>Two thoughts after staring at it:<br>
<br>
</div>
<div>1. If you were to form SESE/etc regions and process
them in topo order, you could actually do even better than
you do now. You would be able to validly say "the first
X regions of this function are the same". You currently do
this in some cases, but you could do it in more :)</div>
</div>
</div>
</div>
</blockquote>
<br>
Yep, that's a good idea. You could then also outline identical
regions from functions... :) <br>
<br>
An easy improvement to the current traversal algorithm would be to
enqueue successor blocks in the order of their instruction count as
opposed to the order they appear in the terminator instruction. That
way, you could handle easy cases where the successor blocks are
swapped but otherwise equivalent or similar.<br>
<br>
<blockquote
cite="mid:CAF4BwTU+bfVC+kKvjjBHOLraQ_YEwzk0-R7S7BCBqzqLR5mH-w@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>2. You could integrate most of what you are doing with
mergefuncs with a bit of work.</div>
</div>
</div>
</div>
</blockquote>
<br>
Depends on how you define "a bit of work" ;) See below.<br>
<br>
<blockquote
cite="mid:CAF4BwTU+bfVC+kKvjjBHOLraQ_YEwzk0-R7S7BCBqzqLR5mH-w@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
As to whether this could be integrated into the in-tree
MergeFunctions... Well, this started off as a patch to
MergeFunctions back in 2013. However, the in-tree
MergeFunctions has undergone significant architectural
changes since then; it now uses a total ordering of
functions to speed up merging. This is great if you only
want to merge identical functions, but it doesn't work for
merging of similar functions.<br>
</blockquote>
<div><br>
</div>
<div>This is not correct.</div>
<div>You could use simhash, for example, or any thing that
will give you actual similarity metrics in addition to
total ordering.</div>
</div>
</div>
</div>
</blockquote>
<br>
What I meant to say is that it doesn't make sense to use the
*current* hash function and the *current* total ordering criterion
in the in-tree MergeFunctions if you want to merge similar
functions.<br>
<br>
Why? <br>
<br>
1. The hash function includes the instruction opcodes. Similar, but
non-identical, functions will end up in different buckets. We'd have
to replace the hash function by something less precise for similar
merging. Hashing the CFG structure is a neat idea though, I should
be doing that too.<br>
<br>
2. The total ordering criterion (used inside the buckets) in
MergeFunctions is based on equality. A very large chunk of the
MergeFunctions code deals with those comparisons (see all the cmpXYZ
functions). Similar functions won't be near each other if you use
that specific ordering criterion, so it's not much use. We'd need to
replace the criterion by something else - and yes, similarity
hashing is a great option for that (although not yet implemented so
unclear how well it'd work for IR). <br>
<br>
So essentially, to make this work in MergeFunctions, we'd have to
rip out two of its key advantages (which are great if you only want
to merge identical functions)... and you'd essentially get this pass
:)<br>
<br>
Am I missing something?<br>
<br>
Since the two MergeFunctions share a common ancestor, there is some
shared code; basically a shared skeleton. So it should be possible
to factor that out into a "MergeFunctionsUtils" sort of module.
Perhaps that's a way to go if people are happy with that?<br>
<br>
Thanks,<br>
Tobias<br>
<pre class="moz-signature" cols="72">--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project.
</pre>
</body>
</html>