[llvm-dev] [DebugInfo] RFC: Introduce LLVM DI Checker utility

Mon Jul 6 07:41:31 PDT 2020

Hi Vedant,

I've managed to merge this idea into the existing Debugify Pass by 
introducing two modes:

   1) synthetic - everything stay as is; this mode deals with the 
synthetic debug info

   2) original - this is the new mode; it checks debug info metadata 
preservation on real/original/-g debug info

The initial patches are: https://reviews.llvm.org/D82545 and 
https://reviews.llvm.org/D82546.

There are still TODOs (such as to cover the dbg.values tracking by using 
the idea you pointed out in a previous mail; or to cover metadata other 
then DILocations and DISubprograms (if there are any that make sense)).

Best regards,

Djordje

On 19.6.20. 09:33, Djordje wrote:
>
>
> On 19.6.20. 08:27, Vedant Kumar wrote:
>>
>>
>>> On Jun 18, 2020, at 1:58 AM, Djordje <djordje.todorovic at syrmia.com 
>>> <mailto:djordje.todorovic at syrmia.com>> wrote:
>>>
>>> Hi Vedant,
>>>
>>> Thanks a lot for your comments!
>>>
>>> >It looks like a lot of the new infrastructure introduced here 
>>> <https://github.com/djolertrk/llvm-di-checker/commit/9d26ac2557c584f6cf82ac5535fc47f8bd267a27> consists 
>>> of logic copied from the debugify implementation. Why is introducing 
>>> a new pair of passes better than extending the ones we have? The 
>>> core infrastructure needed to track location loss for real 
>>> (non-synthetic) source variables is is in place already.
>>>
>>> Since it is a proposal, I thought it'd easier to understand the idea 
>>> if I duplicate things. Ideally, we can make an API that could be 
>>> used from both tools. Initially, I made a few patches locally 
>>> turning the real debug info into debugify ones, but I realized it 
>>> breaks the original idea/design of the debugify & that is why I 
>>> decided to make a separate pass(es). This cannot stay as is with the 
>>> respect to the implementation, it should be either merged into 
>>> debugify file(s) or refactored using the API mentioned above.
>>>
>> Oh, this wasn’t clear to me. In the future, please describe what 
>> is/isn’t part of the proposal. It’d be especially helpful to include 
>> some discussion of the pros & cons of the prototype design and its 
>> alternatives.
>>
> Sorry If I wasn't clear enough. I thought if I write [PROPOSAL], all 
> of it is considered as a proposal.
>>>
>>> Another reason for implementing it as a different pass was the fact 
>>> the debugify is meant to be used from 'opt' level only, but if we 
>>> want to invoke the option from front end level, we need to merge it 
>>> into the IR library.
>>>
>> The debugify passes are now linked by llc via the Transforms library 
>> as part of the -mir-debugify implementation. A frontend should be 
>> able to use them.
> I'll try that, thanks.
>>
>>> >Stepping back a bit, I’m also surprised by the decision to move 
>>> away from synthetic testing when there’s still so much low-hanging 
>>> fruit to pick using that technique. The example from 
>>> https://reviews.llvm.org/D81939 illustrates this perfectly: in this 
>>> case it’s not necessary to invent a new testing technique to uncover 
>>> the bug, because simply running `./bin/llvm-lit -Dopt="opt 
>>> -debugify-each" test/Transforms/DeadArgElim` finds the same issue.
>>>
>>> As I mentioned in the previous mail, I do really think the debugify 
>>> technique is great & I use it. But, in order to detect that variable 
>>> "x" was optimized-out starting from pass Y, I only run the 
>>> di-checker option (that performs analysis only) & find the variable 
>>> in the final html report. I think that is very user friendly concept.
>>>
>> About the analysis — why not emit a report in -check-debugify when 
>> the # of non-undef debug uses of a variable drops to 0? This doesn’t 
>> require the -debugify step. The list of local variables is preserved 
>> via DISubprogram::retainedNodes.
>
> Hmm, good idea. But we'd need to workaround the fact/condition the 
> debugify(& check-debugify) works with/expects the synthetic debug info 
> only. Let me try merging these ideas into the code, by removing the 
> duplicated code (I'll try to use the debugify/check-debugify as much 
> as possible by performing analysis only).
>
>>
>>> At the end, when we detected what was the spot of loosing the 
>>> location, we can run debugify on the pass-directory-tests (but there 
>>> is a concern the tests does not cover all the possible cases; and 
>>> the case found from the high level could be new to the pass). In 
>>> addition, the di-checker detects issues for metadata other than 
>>> locations (currently, the preservation map keeps the disubprograms 
>>> only, but it should keep other kinds too).
>>>
>> Note that it’s possible to to increase code coverage by running a 
>> -debugify-each pipeline on -g0 IR produced by a frontend.
>>
>> Is it common for a pass to drop an entire DISubprogram? I would hope 
>> this either never happens or is extremely rare.
>
> It is extremely rare, but there are passes that create new functions, 
> and it possible to forget to update/attach the subprogram on that (the 
> similar situation we face with locations when someone forgets to set 
> debugloc on an instruction).
>
>
> Best,
>
> Djordje
>
>>
>> best,
>> vedant
>>
>>> >In D81939, you discuss finding the new tool useful when responding 
>>> to bug reports about optimized-out variables or missing locations. 
>>> We sorely do need something better than -opt-bisect-limit, but why 
>>> not start with something simple? -check-debugify already knows how 
>>> to report when & where a location is dropped, it would be simple to 
>>> teach it to emit a report when a variable is fully optimized-out.
>>>
>>> I agree. We can do that and that could be used from both utilities.
>>>
>>>
>>> Best regards,
>>>
>>> Djordje
>>>
>>>
>>> On 17.6.20. 21:14, Vedant Kumar wrote:
>>>> Hey Djordje,
>>>>
>>>> It looks like a lot of the new infrastructure introduced here 
>>>> <https://github.com/djolertrk/llvm-di-checker/commit/9d26ac2557c584f6cf82ac5535fc47f8bd267a27> consists 
>>>> of logic copied from the debugify implementation. Why is 
>>>> introducing a new pair of passes better than extending the ones we 
>>>> have? The core infrastructure needed to track location loss for 
>>>> real (non-synthetic) source variables is is in place already.
>>>>
>>>> Stepping back a bit, I’m also surprised by the decision to move 
>>>> away from synthetic testing when there’s still so much low-hanging 
>>>> fruit to pick using that technique. The example from 
>>>> https://reviews.llvm.org/D81939 illustrates this perfectly: in this 
>>>> case it’s not necessary to invent a new testing technique to 
>>>> uncover the bug, because simply running `./bin/llvm-lit -Dopt="opt 
>>>> -debugify-each" test/Transforms/DeadArgElim` finds the same issue.
>>>>
>>>> In D81939, you discuss finding the new tool useful when responding 
>>>> to bug reports about optimized-out variables or missing locations. 
>>>> We sorely do need something better than -opt-bisect-limit, but why 
>>>> not start with something simple? -check-debugify already knows how 
>>>> to report when & where a location is dropped, it would be simple to 
>>>> teach it to emit a report when a variable is fully optimized-out.
>>>>
>>>>
>>>>> On Jun 17, 2020, at 2:10 AM, Djordje <djordje.todorovic at syrmia.com 
>>>>> <mailto:djordje.todorovic at syrmia.com>> wrote:
>>>>>
>>>>> I am sharing the proposal [0] which gives a brief introduction for 
>>>>> the implementation of the LLVM DI Checker utility. On a very high 
>>>>> level, it is a pair of LLVM (IR) Passes that check the 
>>>>> preservation of the original debug info in the optimizations. 
>>>>> There are options controlling the passes, that could be invoked 
>>>>> from ``clang`` as well as from ``opt`` level.
>>>>>
>>>>> By testing the utility on the GDB 7.11 project (using it as a 
>>>>> testbed), it has found a certain number of potential issues 
>>>>> regarding the DILocations (using it on LLVM project build itself, 
>>>>> it has found one bug regarding DISubprogram metadata). Please take 
>>>>> a look into the final report (on the GDB 7.11 testbed) generated 
>>>>> from the script that collects the data at [1]. By looking at these 
>>>>> data, it looks that the utility like this could be useful when 
>>>>> trying to detect the real issues related to debug info production 
>>>>> by the compiler.
>>>>
>>>> Thanks for sharing these results. The data here is older (from the 
>>>> 2018 debug info BoF) and from a different project (sqlite3), but we 
>>>> saw some similar patterns: 
>>>> https://llvm.org/devmtg/2018-10/slides/Prantl-Kumar-debug-info-bof-2018.pdf
>>>>
>>>> best
>>>> vedant
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200706/74a71557/attachment-0001.html>