<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Artem Dergachev <<a href="mailto:noqnoqneo@gmail.com">noqnoqneo@gmail.com</a>> ezt írta (időpont: 2019. márc. 7., Cs, 1:38):<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
An unrelated humble/funny project is now officially up on
<a class="gmail-m_4103929222761871087moz-txt-link-freetext" href="http://llvm.org/OpenProjects.html#analyze-llvm" target="_blank">http://llvm.org/OpenProjects.html#analyze-llvm</a><br>
<br>
<div class="gmail-m_4103929222761871087moz-cite-prefix">On 3/2/19 6:50 AM, Gábor Horváth wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>Hi!</div>
<div><br>
</div>
<div>I think I could have the same level of commitment I had
last summer. If you find that sufficient, I am happy to help
with the mentoring!<br>
</div>
</div>
</blockquote>
<br>
Yay! I mean, regardless of how the final setup would look, i'd be
happy to see you intervene :)<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>I do agree with Artem, if you just started to familiarize
yourself with the topic, I would suggest starting with
implementing some algorithms from scratch (or tweak existing
ones). While improving existing algorithms is always nice, and
it is good to get rid of some technical debt, it is sometimes
harder to write a good report in the end. <br>
</div>
<div><br>
</div>
<div>Some random ideas that I would love to see implemented:</div>
<div>* Points to analysis would have a number of uses like more
precise loop widening, call graph construction, ability to
experiment with other invalidation schemes</div>
<div>* Static backward slicing could help us shorten bug paths
and filter events</div>
</div>
</blockquote>
<br>
Whoa, this one sounds really interesting and also very impactful.
We're right now pretty bad at figuring out which events are relevant
and which aren't, and it constantly bothers me (one of the latest
examples is <a class="gmail-m_4103929222761871087moz-txt-link-freetext" href="https://bugs.llvm.org/show_bug.cgi?id=40913#c4" target="_blank">https://bugs.llvm.org/show_bug.cgi?id=40913#c4</a>). Maybe
we could even poke that as a separate smaller project? Another good
thing about this smaller project is that i don't know how to
implement that, but it sounds like you do - which means that we
could co-mentor it as part of GSoC (:<br></div></blockquote><div><br></div><div>Currently, this is the one topic that interests me the most, it sounds like an achievable, but "tough enough" task for a summer.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div bgcolor="#FFFFFF">
<br>
<blockquote type="cite">
<div dir="ltr">
<div>* For some checks even just a set of dominators/post
dominators for each basic block would be useful, see Kristof's
uninitialized value check's "guarded member" heuristic</div></div></blockquote></div></blockquote><div>Before I make a former proposal however, I'll implement this as an exercise, because (according to Gábor) this is a project that could be implemented in a week(ish). First, I'll add some code to my checker, and see what it takes to make this functionality available to every checker -- this should give me a good idea on how this entire project could look like.</div><div><br></div><div>Thank you so much for the encouragement by the way, I'm super excited to work on this! :)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div bgcolor="#FFFFFF"><blockquote type="cite"><div dir="ltr">
<div>* Maybe some better measure of function complexity for the
analyzer's inlining decision? As far as I remember we use the
number of basic blocks now.</div>
<div>* Maybe some static edge frequency estimate for the CFG to
be able to experiment with other exploration strategies?
Taking the most or least frequently taken edge leading to an
unvisited node sounds like a fun experiment, but nothing
guarantee that the end result will actually be useful.<br>
</div>
<div><br>
</div>
<div>I think it is also ok to write multiple proposals, but you
should aim for quality over quantity.<br>
</div>
<div><br>
</div>
<div>Regards,</div>
<div>Gabor<br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Sat, 2 Mar 2019 at 00:11,
Artem Dergachev <<a href="mailto:noqnoqneo@gmail.com" target="_blank">noqnoqneo@gmail.com</a>> wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hey!<br>
<br>
First of all, yup, we're totally doing GSoC this year! I'll
do my best<br>
to finally force myself to update the open projects list
soon with a<br>
funny random idea that some folks around there are
interested in :)<br>
Now, also i'm alone these days since George changed jobs;
Gabor, are<br>
you interested in / capable of helping out again? 'Cause
they'll most<br>
likely only let me pick one student, while we already have a
few<br>
people looking into participating :)<br>
<br>
Data flow... Well, this one's definitely wanted, but that's
a big one,<br>
definitely bigger than any of the projects that we've tried
so far :)<br>
I guess such project would aim at allowing data-flow
checkers that<br>
don't have to write down transfer functions for every kind
of<br>
statement in every checker, but instead rely on the engine
to handle<br>
common basic effects of statements. If done properly, it
pretty much<br>
means writing a new Static Analyzer, which is more
complicated than<br>
the old one because it also needs its state merge operations
defined.<br>
<br>
There may be "poor-man's" solutions to this, which would
allow us to<br>
have a bit of progress without developing too much complex
machinery.<br>
One such solution is to introduce careful (conservative)
tracking of<br>
dropped execution paths in the Static Analyzer; it would
allow finding<br>
bugs that require analysis of all paths as long as the
Analyzer is<br>
known to have explored the function in its entirety. In
practice<br>
that'd be a lot of dropped coverage (much more than a proper<br>
solution), but it'd still probably find *something*, while
allowing us<br>
to re-use most of the infrastructure. This, of course, is a
dead-end<br>
in the long run - if we ever want to regain the lost
coverage, we'd<br>
have to start from scratch.<br>
<br>
So my overall feel here is that i'd love to hear a specific
proposal,<br>
but i suspect that if you go for doing this properly, one
summer would<br>
only be enough to scratch the surface. If you want to have a
feel of<br>
what it'd take, i'd recommend poking our existing data flow
analyses.<br>
Eg., what would it take to remove my C++ "object under
construction"<br>
liveness hacks by improving live variables analysis? You
might also<br>
want to write a few analyses of your own and see if you can
generalize<br>
something out of them. Eg., if you write an analysis that
checks<br>
whether a function is pure, we can instantly use it to
improve our<br>
conservative evaluation of pure functions (eg., produce the
same<br>
return value symbol every time it's called with the same
parameters<br>
and skip invalidations - we may even try to do it across
translation<br>
units because such summary is easy to serialize).<br>
</blockquote>
</div>
</div>
</blockquote>
<br>
</div>
</blockquote></div></div>