[llvm-dev] [GSoC 2016] Capture Tracking Improvements - Mid term report

Sat Jun 18 11:43:48 PDT 2016

Hi Sanjoy,

On 17/06/16 22:27, Sanjoy Das wrote:
> Hi Scott,
>
> On Fri, Jun 17, 2016 at 11:54 AM, Scott Egerton via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> Over the past two weeks I have been learning a lot more about capture
>> tracking. From this I was able to instrument the current implementation in
>> order to identify some of the false positives in it. I was hoping to have
>
> How are you instrumenting the analysis to identify false positives?

I've instrumented it so that it prints the pointer value to
llvm::dbgs() whenever PointerMayBeCaptured or PointerMayBeCapturedBefore
returns true, in order to identify all "positives", false or not. I am
then manually filtering out all of the cases that appear to be correct
behaviour. Hopefully everything left over will be a false positive.
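
To make the idea concrete, here is a minimal sketch of that kind of
logging wrapper. It is not the actual patch: Value is a hypothetical
stand-in for llvm::Value, the Query callback stands in for the real
capture query, and a plain std::ostream stands in for llvm::dbgs(), so
that the snippet compiles without LLVM.

```cpp
#include <functional>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for llvm::Value; only a printable name is needed.
struct Value {
  std::string Name;
};

// Runs a capture query and logs the pointer whenever the answer is
// "may be captured". Query and Log are assumptions standing in for the
// real analysis entry point and for llvm::dbgs().
bool loggedMayBeCaptured(const Value &V,
                         const std::function<bool(const Value &)> &Query,
                         std::ostream &Log) {
  bool Captured = Query(V);
  if (Captured)
    Log << "captured: " << V.Name << '\n'; // one record per positive
  return Captured;
}
```

Every line written this way is a "positive"; the manual filtering step
then decides which of those are false.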
>
>> more definitive results by now than what I currently have, but due to some
>> unforeseen issues with the output becoming scrambled I cannot provide this
>> just yet. I have now resolved this and plan to have some more concrete
>> results within the coming week.
>>
>> The current algorithm for Capture tracking will look through the uses of a
>> pointer. If there are too many uses it will conservatively say that the
>> value is captured to avoid taking up too much compile time. If not it will
>> then determine whether or not the Value is captured in various ways based on
>> its opcode.
>> However, there are some deficiencies with the current implementation, such as
>> the false positives and the fact that it takes rather a lot of compile time
>> to run.
>>
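
As a rough illustration of the shape of that algorithm, here is a toy
sketch. It is heavily simplified and is not LLVM's code: the UseKind
enum, the function name and the MaxUses parameter are all hypothetical,
and the real analysis handles many more opcodes and also follows
pointers derived from the original one.

```cpp
#include <vector>

// Hypothetical, simplified classification of how a pointer is used.
enum class UseKind {
  Load,           // value is loaded through the pointer
  StoreThrough,   // something is stored to the pointed-to memory
  StoreOfPointer, // the pointer itself is written to memory
  UnknownCall     // the pointer is passed to a call we know nothing about
};

// Returns true if the pointer may be captured, given the kinds of its uses.
bool mayBeCaptured(const std::vector<UseKind> &Uses, unsigned MaxUses) {
  // Too many uses: answer conservatively instead of spending compile time.
  if (Uses.size() > MaxUses)
    return true;

  for (UseKind K : Uses) {
    switch (K) {
    case UseKind::Load:
    case UseKind::StoreThrough:
      break; // these uses do not let the pointer escape
    case UseKind::StoreOfPointer:
    case UseKind::UnknownCall:
      return true; // the pointer value may outlive this use
    }
  }
  return false;
}
```

A false positive in this picture is a use that lands in the "return
true" branch even though the pointer cannot actually escape.
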
>> I am currently compiling a large piece of code with an instrumented version
>
> Can you give some more details here?  What is this large piece of
> code?

It's the Firefox source code. I saw someone on IRC using it as well and
thought it would be a good choice, as it is supported well enough to "just work".
>
>> of LLVM in order to identify false positives. I then plan to fix all the
>> identified false positives. After this is complete I will be moving onto
>
> This sounds reasonable, but I'd say it is better to avoid this kind of
> workflow:
>
>    for (every_false_positive)
>      understand_false_positive;
>    for (every_false_positive)
>      fix_false_positive;
>
> but have it be more like:
>
>    for (every_false_positive) {
>      understand_false_positive;
>      fix_false_positive;
>    }
>
> i.e. you don't have to understand _all_ of the cases where our capture
> tracking is too conservative to make it better.  Find one _specific_
> case where LLVM today is stupid, and fix it; and iterate.
> Making these kinds of small changes will also increase the trust the
> community has in you, which will be helpful when you start proposing
> bigger changes (e.g. perhaps for non-escaping subgraphs).

I agree that this is a better workflow. However, because my process is
to filter the instrumented output, I am finding it difficult to isolate
a single false positive without turning up at least a few at once.
>
> -- Sanjoy
>

Many thanks,
Scott