[llvm-dev] [DebugInfo] RFC: Introduce LLVM DI Checker utility

Fri Jun 19 00:33:44 PDT 2020

On 19.6.20. 08:27, Vedant Kumar wrote:
>
>
>> On Jun 18, 2020, at 1:58 AM, Djordje <djordje.todorovic at syrmia.com 
>> <mailto:djordje.todorovic at syrmia.com>> wrote:
>>
>> Hi Vedant,
>>
>> Thanks a lot for your comments!
>>
>> >It looks like a lot of the new infrastructure introduced here 
>> <https://github.com/djolertrk/llvm-di-checker/commit/9d26ac2557c584f6cf82ac5535fc47f8bd267a27> consists 
>> of logic copied from the debugify implementation. Why is introducing 
>> a new pair of passes better than extending the ones we have? The core 
>> infrastructure needed to track location loss for real (non-synthetic) 
>> source variables is is in place already.
>>
>> Since it is a proposal, I thought it'd easier to understand the idea 
>> if I duplicate things. Ideally, we can make an API that could be used 
>> from both tools. Initially, I made a few patches locally turning the 
>> real debug info into debugify ones, but I realized it breaks the 
>> original idea/design of the debugify & that is why I decided to make 
>> a separate pass(es). This cannot stay as is with the respect to the 
>> implementation, it should be either merged into debugify file(s) or 
>> refactored using the API mentioned above.
>>
> Oh, this wasn’t clear to me. In the future, please describe what 
> is/isn’t part of the proposal. It’d be especially helpful to include 
> some discussion of the pros & cons of the prototype design and its 
> alternatives.
>
Sorry If I wasn't clear enough. I thought if I write [PROPOSAL], all of 
it is considered as a proposal.
>>
>> Another reason for implementing it as a different pass was the fact 
>> the debugify is meant to be used from 'opt' level only, but if we 
>> want to invoke the option from front end level, we need to merge it 
>> into the IR library.
>>
> The debugify passes are now linked by llc via the Transforms library 
> as part of the -mir-debugify implementation. A frontend should be able 
> to use them.
I'll try that, thanks.
>
>> >Stepping back a bit, I’m also surprised by the decision to move away 
>> from synthetic testing when there’s still so much low-hanging fruit 
>> to pick using that technique. The example from 
>> https://reviews.llvm.org/D81939 illustrates this perfectly: in this 
>> case it’s not necessary to invent a new testing technique to uncover 
>> the bug, because simply running `./bin/llvm-lit -Dopt="opt 
>> -debugify-each" test/Transforms/DeadArgElim` finds the same issue.
>>
>> As I mentioned in the previous mail, I do really think the debugify 
>> technique is great & I use it. But, in order to detect that variable 
>> "x" was optimized-out starting from pass Y, I only run the di-checker 
>> option (that performs analysis only) & find the variable in the final 
>> html report. I think that is very user friendly concept.
>>
> About the analysis — why not emit a report in -check-debugify when the 
> # of non-undef debug uses of a variable drops to 0? This doesn’t 
> require the -debugify step. The list of local variables is preserved 
> via DISubprogram::retainedNodes.

Hmm, good idea. But we'd need to workaround the fact/condition the 
debugify(& check-debugify) works with/expects the synthetic debug info 
only. Let me try merging these ideas into the code, by removing the 
duplicated code (I'll try to use the debugify/check-debugify as much as 
possible by performing analysis only).

>
>> At the end, when we detected what was the spot of loosing the 
>> location, we can run debugify on the pass-directory-tests (but there 
>> is a concern the tests does not cover all the possible cases; and the 
>> case found from the high level could be new to the pass). In 
>> addition, the di-checker detects issues for metadata other than 
>> locations (currently, the preservation map keeps the disubprograms 
>> only, but it should keep other kinds too).
>>
> Note that it’s possible to to increase code coverage by running a 
> -debugify-each pipeline on -g0 IR produced by a frontend.
>
> Is it common for a pass to drop an entire DISubprogram? I would hope 
> this either never happens or is extremely rare.

It is extremely rare, but there are passes that create new functions, 
and it possible to forget to update/attach the subprogram on that (the 
similar situation we face with locations when someone forgets to set 
debugloc on an instruction).

Best,

Djordje

>
> best,
> vedant
>
>> >In D81939, you discuss finding the new tool useful when responding 
>> to bug reports about optimized-out variables or missing locations. We 
>> sorely do need something better than -opt-bisect-limit, but why not 
>> start with something simple? -check-debugify already knows how to 
>> report when & where a location is dropped, it would be simple to 
>> teach it to emit a report when a variable is fully optimized-out.
>>
>> I agree. We can do that and that could be used from both utilities.
>>
>>
>> Best regards,
>>
>> Djordje
>>
>>
>> On 17.6.20. 21:14, Vedant Kumar wrote:
>>> Hey Djordje,
>>>
>>> It looks like a lot of the new infrastructure introduced here 
>>> <https://github.com/djolertrk/llvm-di-checker/commit/9d26ac2557c584f6cf82ac5535fc47f8bd267a27> consists 
>>> of logic copied from the debugify implementation. Why is introducing 
>>> a new pair of passes better than extending the ones we have? The 
>>> core infrastructure needed to track location loss for real 
>>> (non-synthetic) source variables is is in place already.
>>>
>>> Stepping back a bit, I’m also surprised by the decision to move away 
>>> from synthetic testing when there’s still so much low-hanging fruit 
>>> to pick using that technique. The example from 
>>> https://reviews.llvm.org/D81939 illustrates this perfectly: in this 
>>> case it’s not necessary to invent a new testing technique to uncover 
>>> the bug, because simply running `./bin/llvm-lit -Dopt="opt 
>>> -debugify-each" test/Transforms/DeadArgElim` finds the same issue.
>>>
>>> In D81939, you discuss finding the new tool useful when responding 
>>> to bug reports about optimized-out variables or missing locations. 
>>> We sorely do need something better than -opt-bisect-limit, but why 
>>> not start with something simple? -check-debugify already knows how 
>>> to report when & where a location is dropped, it would be simple to 
>>> teach it to emit a report when a variable is fully optimized-out.
>>>
>>>
>>>> On Jun 17, 2020, at 2:10 AM, Djordje <djordje.todorovic at syrmia.com 
>>>> <mailto:djordje.todorovic at syrmia.com>> wrote:
>>>>
>>>> I am sharing the proposal [0] which gives a brief introduction for 
>>>> the implementation of the LLVM DI Checker utility. On a very high 
>>>> level, it is a pair of LLVM (IR) Passes that check the preservation 
>>>> of the original debug info in the optimizations. There are options 
>>>> controlling the passes, that could be invoked from ``clang`` as 
>>>> well as from ``opt`` level.
>>>>
>>>> By testing the utility on the GDB 7.11 project (using it as a 
>>>> testbed), it has found a certain number of potential issues 
>>>> regarding the DILocations (using it on LLVM project build itself, 
>>>> it has found one bug regarding DISubprogram metadata). Please take 
>>>> a look into the final report (on the GDB 7.11 testbed) generated 
>>>> from the script that collects the data at [1]. By looking at these 
>>>> data, it looks that the utility like this could be useful when 
>>>> trying to detect the real issues related to debug info production 
>>>> by the compiler.
>>>
>>> Thanks for sharing these results. The data here is older (from the 
>>> 2018 debug info BoF) and from a different project (sqlite3), but we 
>>> saw some similar patterns: 
>>> https://llvm.org/devmtg/2018-10/slides/Prantl-Kumar-debug-info-bof-2018.pdf
>>>
>>> best
>>> vedant
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200619/ca5cbd87/attachment.html>