[PATCH] D113741: [RFC][DwarfDebug][AsmPrinter] Support emitting function-local declaration for a lexical block

David Blaikie via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 24 11:48:14 PST 2022


dblaikie added a comment.

In D113741#3343683 <https://reviews.llvm.org/D113741#3343683>, @krisb wrote:

> In D113741#3343496 <https://reviews.llvm.org/D113741#3343496>, @dblaikie wrote:
>
>> In D113741#3343268 <https://reviews.llvm.org/D113741#3343268>, @krisb wrote:
>>
>>> @dblaikie I've updated the patch with a fix for split dwarf issue (the test is included). Could I ask you to check the full build on your side?
>>>
>>> The issue was in wrong assumptions about in which CUs we need to create those local entities. Basically, they [CUs] should match the ones where the parent subprogram was emitted.
>>> It's not an easy thing to find the particular list of suited CUs while we are in `endModule()` (especially, because split-dwarf and split-dwarf inlining introduce a tricky logic around), so I chose to collect this information during subprograms emission.
>>> The solution still looks a bit hacky, but I hope this is a step forward.
>>>
>>> What about the second issue, unfortunately (or fortunately? :)), I'm still able to reproduce the crash on ToT of `main`, so I'm not sure this patch may cause the issue. I'll appreciate if could check it once again.
>>>
>>> Thank you and sorry for the long delay.
>>
>> All good.
>>
>> I'm not immediately following why a list of CUs is needed/why we'd create the variable in more than one CU - could you explain that?
>
> Well, this is because we may have a parent subprogram emitted in more than one CU (in the case of split dwarf), and we do not know in which one. 
> Here are the situations I see (hope I got them right):
> (1) If split dwarf disabled, we have a single list of subprogram's abstract origins and can reference them between CUs. So, we'll emit a concrete definition and its abstract origin (any existing) in the same CU. In this case, we have only one CU.
> (2) If we have split dwarf enabled, and `shareAcrossDWOCUs()` is false (it's false by default), every CU will maintain its own abstract origin's list. In this case, we cannot have references between CUs, and if we have cross-CU inlining, we'll create:
>
> - concrete def -> in the CU the subprogram is defined,
> - abstract origin -> in every CU the subprogram is inlined. For all local entities we should follow the same logic.

Do you have a test case for this (sorry, I realize it might be one of the test cases in the patch - I haven't looked/would appreciate some pointers)? My recollection is that without shareAcrossDWOCUs we should probably be only emitting one CU if using Split DWARF - Split DWARF doesn't really support multiple CUs in a single .dwo file (& we only support emitting a single .dwo for a given .o file) - but I guess maybe we do in full LTO mode, even if we don't do it in ThinLTO mode (ThinLTO's the only thing I've been especially concerned with/the only mode we use at Google)?

> (3) SplitDwarfInlining adds skeleton units to this list. Basically, we do not need to emit any local entities for skeleton units in this case, but I just reused already existing logic around `includeMinimalInlineScopes()`.
>
> The reproducer you provided is about the case (2). We tried to emit the imported entity in a CU, where its parent subprogram was defined, but the subprogram itself was emitted in the CU it was inlined.





================
Comment at: llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:1408-1409
+    }
+    DwarfCompileUnit *const *TheCUPtr = &TheCU;
+    return llvm::make_range(TheCUPtr, TheCUPtr + 1);
+  };
----------------
krisb wrote:
> dblaikie wrote:
> > krisb wrote:
> > > dblaikie wrote:
> > > > This looks buggy - it's returning a pointer to a local variable (the `TheCU` pointer) which means it'll be dangling once the function returns, yeah? (I'd expect msan to catch this, for instance)
> > > Uuups, bad experiments. Fixed.
> > This code still looks buggy, it's still returning the address of a local variable (the address of "TheU" pointer) so it'd still be UB/dangling/msan-unclean, I think?
> Hurry isn't a good way to do smth good. Sorry. One more try and I'll go to check this with msan.
That doesn't seem like a valid range either - that range would be from "TheCU" until nullptr - so a huge chunk of the address space?

I think now you need/could have "TheCU + 1" as the end of the range and that'd be correct, now that it's passed in a pointer-to-pointer.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113741/new/

https://reviews.llvm.org/D113741



More information about the llvm-commits mailing list