<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Artem Dergachev <<a href="mailto:noqnoqneo@gmail.com">noqnoqneo@gmail.com</a>> ezt írta (időpont: 2019. márc. 7., Cs, 1:38):<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

  <div bgcolor="#FFFFFF">

    An unrelated humble/funny project is now officially up on

    <a class="gmail-m_4103929222761871087moz-txt-link-freetext" href="http://llvm.org/OpenProjects.html#analyze-llvm" target="_blank">http://llvm.org/OpenProjects.html#analyze-llvm</a><br>

    <br>

    <div class="gmail-m_4103929222761871087moz-cite-prefix">On 3/2/19 6:50 AM, Gábor Horváth wrote:<br>

    </div>

    <blockquote type="cite">

      <div dir="ltr">

        <div>Hi!</div>

        <div><br>

        </div>

        <div>I think I could have the same level of commitment I had

          last summer. If you find that sufficient, I am happy to help

          with the mentoring!<br>

        </div>

      </div>

    </blockquote>

    <br>

    Yay! I mean, regardless of how the final setup would look, i'd be

    happy to see you intervene :)<br>

    <br>

    <blockquote type="cite">

      <div dir="ltr">

        <div>I do agree with Artem, if you just started to familiarize

          yourself with the topic, I would suggest starting with

          implementing some algorithms from scratch (or tweak existing

          ones). While improving existing algorithms is always nice, and

          it is good to get rid of some technical debt, it is sometimes

          harder to write a good report in the end. <br>

        </div>

        <div><br>

        </div>

        <div>Some random ideas that I would love to see implemented:</div>

        <div>* Points to analysis would have a number of uses like more

          precise loop widening, call graph construction, ability to

          experiment with other invalidation schemes</div>

        <div>* Static backward slicing could help us shorten bug paths

          and filter events</div>

      </div>

    </blockquote>

    <br>

    Whoa, this one sounds really interesting and also very impactful.

    We're right now pretty bad at figuring out which events are relevant

    and which aren't, and it constantly bothers me (one of the latest

    examples is <a class="gmail-m_4103929222761871087moz-txt-link-freetext" href="https://bugs.llvm.org/show_bug.cgi?id=40913#c4" target="_blank">https://bugs.llvm.org/show_bug.cgi?id=40913#c4</a>). Maybe

    we could even poke that as a separate smaller project? Another good

    thing about this smaller project is that i don't know how to

    implement that, but it sounds like you do - which means that we

    could co-mentor it as part of GSoC (:<br></div></blockquote><div><br></div><div>Currently, this is the one topic that interests me the most, it sounds like an achievable, but "tough enough" task for a summer.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div bgcolor="#FFFFFF">

    <br>

    <blockquote type="cite">

      <div dir="ltr">

        <div>* For some checks even just a set of dominators/post

          dominators for each basic block would be useful, see Kristof's

          uninitialized value check's "guarded member" heuristic</div></div></blockquote></div></blockquote><div>Before I make a former proposal however, I'll implement this as an exercise, because (according to Gábor) this is a project that could be implemented in a week(ish). First, I'll add some code to my checker, and see what it takes to make this functionality available to every checker -- this should give me a good idea on how this entire project could look like.</div><div><br></div><div>Thank you so much for the encouragement by the way, I'm super excited to work on this! :)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div bgcolor="#FFFFFF"><blockquote type="cite"><div dir="ltr">

        <div>* Maybe some better measure of function complexity for the

          analyzer's inlining decision? As far as I remember we use the

          number of basic blocks now.</div>

        <div>* Maybe some static edge frequency estimate for the CFG to

          be able to experiment with other exploration strategies?

          Taking the most or least frequently taken edge leading to an

          unvisited node sounds like a fun experiment, but nothing

          guarantee that the end result will actually be useful.<br>

        </div>

        <div><br>

        </div>

        <div>I think it is also ok to write multiple proposals, but you

          should aim for quality over quantity.<br>

        </div>

        <div><br>

        </div>

        <div>Regards,</div>

        <div>Gabor<br>

        </div>

        <br>

        <div class="gmail_quote">

          <div dir="ltr" class="gmail_attr">On Sat, 2 Mar 2019 at 00:11,

            Artem Dergachev <<a href="mailto:noqnoqneo@gmail.com" target="_blank">noqnoqneo@gmail.com</a>> wrote:<br>

          </div>

          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hey!<br>

            <br>

            First of all, yup, we're totally doing GSoC this year! I'll

            do my best<br>

            to finally force myself to update the open projects list

            soon with a<br>

            funny random idea that some folks around there are

            interested in :)<br>

            Now, also i'm alone these days since George changed jobs;

            Gabor, are<br>

            you interested in / capable of helping out again? 'Cause

            they'll most<br>

            likely only let me pick one student, while we already have a

            few<br>

            people looking into participating :)<br>

            <br>

            Data flow... Well, this one's definitely wanted, but that's

            a big one,<br>

            definitely bigger than any of the projects that we've tried

            so far :)<br>

            I guess such project would aim at allowing data-flow

            checkers that<br>

            don't have to write down transfer functions for every kind

            of<br>

            statement in every checker, but instead rely on the engine

            to handle<br>

            common basic effects of statements. If done properly, it

            pretty much<br>

            means writing a new Static Analyzer, which is more

            complicated than<br>

            the old one because it also needs its state merge operations

            defined.<br>

            <br>

            There may be "poor-man's" solutions to this, which would

            allow us to<br>

            have a bit of progress without developing too much complex

            machinery.<br>

            One such solution is to introduce careful (conservative)

            tracking of<br>

            dropped execution paths in the Static Analyzer; it would

            allow finding<br>

            bugs that require analysis of all paths as long as the

            Analyzer is<br>

            known to have explored the function in its entirety. In

            practice<br>

            that'd be a lot of dropped coverage (much more than a proper<br>

            solution), but it'd still probably find *something*, while

            allowing us<br>

            to re-use most of the infrastructure. This, of course, is a

            dead-end<br>

            in the long run - if we ever want to regain the lost

            coverage, we'd<br>

            have to start from scratch.<br>

            <br>

            So my overall feel here is that i'd love to hear a specific

            proposal,<br>

            but i suspect that if you go for doing this properly, one

            summer would<br>

            only be enough to scratch the surface. If you want to have a

            feel of<br>

            what it'd take, i'd recommend poking our existing data flow

            analyses.<br>

            Eg., what would it take to remove my C++ "object under

            construction"<br>

            liveness hacks by improving live variables analysis? You

            might also<br>

            want to write a few analyses of your own and see if you can

            generalize<br>

            something out of them. Eg., if you write an analysis that

            checks<br>

            whether a function is pure, we can instantly use it to

            improve our<br>

            conservative evaluation of pure functions (eg., produce the

            same<br>

            return value symbol every time it's called with the same

            parameters<br>

            and skip invalidations - we may even try to do it across

            translation<br>

            units because such summary is easy to serialize).<br>

          </blockquote>

        </div>

      </div>

    </blockquote>

    <br>

  </div>

</blockquote></div></div>