[llvm-dev] [MCJIT] messy call stack debug on x64 code in VisualStudio

Vivien Millet via llvm-dev llvm-dev at lists.llvm.org
Thu Mar 5 14:35:49 PST 2020


I would still try :) because having a backing DLL or not doesn't matter
here; the only thing that matters is whether Visual Studio finds a function
table (RtlAddFunctionTable / RtlLookupFunctionEntry API). This post explains
it: https://stackoverflow.com/a/58227575/809199.
Having a backing DLL is harder because it requires every .pdata and .xdata
section to be large enough to receive the JITed unwind info. That is easier
with dynamically allocated memory, especially when you are a newbie in the
DLL format like me!

Here it is, I just dropped it!
https://github.com/vlmillet/llvmjitpdb
A review from @Zachary Turner <zturner at google.com> and one of you would be
welcome!
I've reworked it so that it depends only on LLVM libraries and lives inside
the llvm:: namespace.
I don't know if this is the right way to distribute an extension to LLVM...
It would be easier for users if it could be integrated directly into the
LLVM solution, but I don't know what the validation process is for such an
integration.

Just in case someone wants to implement something similar without a hacked
DLL (which means no PDB debugging, or at least I don't know how to get it),
the steps I would follow are these:
- use *VirtualAlloc* to allocate memory outside of any module range (using
a custom MemoryManager).
- keep track of the *".xdata"* and *".pdata"* section allocations in
*MemoryManager::allocateDataSection*.
- *fix* (if not already OK) *each* *RUNTIME_FUNCTION::UnwindData* in .pdata
so that it points inside the .xdata allocated memory range (newCurrDataPos =
(oldCurrDataPos - oldFirstDataPos) + VirtualAllocStart).
- call *RtlAddFunctionTable* with "BaseAddress = VirtualAllocStart" and
"FunctionTable = .pdata address".
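
The last two steps can be sketched roughly as follows. This is only an
illustration under assumptions that are not in the thread: the struct
RuntimeFunction is a hand-written stand-in for the Win64 RUNTIME_FUNCTION
record, the helper names rebaseUnwindData and registerFunctionTable are
hypothetical, and the fix-up formula is read as "keep each entry's offset
inside .xdata, then re-express it as an RVA relative to VirtualAllocStart".
The .pdata memory must stay valid for as long as the table is registered.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(_WIN64) && (defined(_M_X64) || defined(__x86_64__))
#include <windows.h>
#define HAVE_WIN64_UNWIND 1
#endif

// Stand-in for the Win64 RUNTIME_FUNCTION record: three 32-bit RVAs, all
// relative to the BaseAddress later passed to RtlAddFunctionTable.
struct RuntimeFunction {
    std::uint32_t BeginAddress;
    std::uint32_t EndAddress;
    std::uint32_t UnwindData;
};

// Hypothetical helper: rewrite every UnwindData so that it points into the
// final .xdata allocation. Assumes each UnwindData was emitted relative to
// oldXdataRva and that .xdata lives less than 4 GiB above `base`.
void rebaseUnwindData(RuntimeFunction* pdata, std::size_t count,
                      std::uint32_t oldXdataRva,
                      std::uint64_t newXdataAddr,
                      std::uint64_t base) {
    assert(newXdataAddr >= base);
    const std::uint64_t newXdataRva = newXdataAddr - base;
    for (std::size_t i = 0; i < count; ++i) {
        assert(pdata[i].UnwindData >= oldXdataRva);
        // Offset of this entry's unwind info inside the .xdata section.
        const std::uint64_t offset = pdata[i].UnwindData - oldXdataRva;
        const std::uint64_t rva = newXdataRva + offset;
        assert(rva <= UINT32_MAX);  // an RVA must fit in 32 bits
        pdata[i].UnwindData = static_cast<std::uint32_t>(rva);
    }
}

#ifdef HAVE_WIN64_UNWIND
// Hypothetical helper: hand the fixed-up table to the Windows unwinder.
// `base` plays the role of VirtualAllocStart from the steps above.
bool registerFunctionTable(RuntimeFunction* pdata, std::size_t count,
                           std::uint64_t base) {
    static_assert(sizeof(RuntimeFunction) == sizeof(RUNTIME_FUNCTION),
                  "stand-in struct must match RUNTIME_FUNCTION");
    return RtlAddFunctionTable(reinterpret_cast<PRUNTIME_FUNCTION>(pdata),
                               static_cast<DWORD>(count), base) != 0;
}
#endif
```

For example, with a base of 0x7FF000000000 and .xdata placed at
0x7FF000020000, an entry whose UnwindData was 0x300C relative to an old
.xdata RVA of 0x3000 ends up with UnwindData = 0x2000C.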

Kind regards,
Vivien

On Wed, Mar 4, 2020 at 16:45, Jameson Nash <vtjnash at gmail.com> wrote:

> I think it's not enough, for reasons related to the pain you went through.
> Normally, the JIT doesn't have a backing DLL and so it doesn't support the
> relocation type required by the xdata and pdata sections. As you say, it
> generally doesn't work very well in WinDBG anyways, since they removed the
> use of RBP to walk the stack. At the time I wrote my hack, LLVM didn't even
> have emission code for those unwind sections, and it hasn't been worth the
> hard effort to change it for me. Is your code public somewhere? It seems
> like it could be useful for all JIT users to have a drop-in option to
> enable debugging.
>
> On Tue, Mar 3, 2020 at 6:06 PM Vivien Millet <vivien.millet at gmail.com>
> wrote:
>
>> Sorry for the late reply, I was investigating what you advised. With all
>> your information and some (really) hard times, I succeeded in registering
>> the unwinding information (not the way you think) and now have a clear
>> call stack! It feels good!
>>
>> @Reid Kleckner <rnk at google.com>
>> After looking deeper, I don't understand why this bug has been open for
>> such a long time, as everything needed to fix it is there: the unwinding
>> process is completely implemented in LLVM and is enabled by adding
>> llvm::Attribute::UWTable to llvm::Function instances.
>> The only thing missing is the call to RtlAddFunctionTable. It should be
>> added to the default MCJIT or OrcJIT memory manager (maybe with an option
>> to enable/disable it), simply by tracking the memory requested for the
>> ".pdata" and ".xdata" sections and calling RtlAddFunctionTable on
>> notifyObjectLoaded. I don't know who is responsible for this, but it
>> might be an easy win that completes a great feature (it would spare
>> users from having to understand all this machinery by themselves...)
>>
>> In my case, I can't call RtlAddFunctionTable because I inject my code
>> into a fake .DLL (for PDB hot-reload purposes) which already has its own
>> static function table.
>> To explain my process (for other devs willing to suffer like I did):
>> If debugging is required by the user, I switch from MCJITMemoryManager
>> to a home-made DllMemoryManager which:
>> - loads a dummy .DLL consisting of empty .text, .rdata (including .xdata)
>> and .pdata sections, without relocations.
>> - allocates memory inside the loaded .DLL address range (the DLL has been
>> hacked for WRITE access)
>> - unloads the .DLL
>> - generates a PDB
>> - *INTERESTING PART*: rewrites the PDATA and XDATA sections with the ones
>> emitted by LLVM (fixing virtual addresses inside the image).
>> - writes the .DLL back to disk
>> - makes the PDB file match the .DLL file (GUID)
>> - reloads the .DLL (it reloads at the same position 100% of the time in
>> my case, I might be lucky but I'm OK with it) so that Visual Studio
>> detects it and loads the matching PDB.
>>
>> This whole process was painful, but it works and I can build and rebuild
>> my language on the fly while debugging it alongside native code inside
>> Visual Studio. You might wonder why I don't generate a real .dll and
>> reload it. Because I can't predict where the functions will be reloaded
>> in memory and I need to keep "reflection" of the JIT symbols (plus it's
>> slower and requires a lot of work on the mangle/link/pdb dependency
>> sides).
>>
>> @Jameson: Thanks for your feedback, it helped me to identify the .xdata
>> and .pdata sections for the unwind stuff!
>> Are you sure that unwinding info is not enough for you to make it
>> debuggable? I personally removed the "no-frame-pointer-elim" attribute
>> and it keeps working well, I still see my full call stack (maybe the
>> attribute is only useful on x86?), because Win64 does not use RBP to walk
>> the stack at all; apparently everything is done with unwind info.
>> PDBs are not involved in any of this. I thought they might be, but
>> no...
>>
>>
>> On Sun, Mar 1, 2020 at 05:54, Jameson Nash <vtjnash at gmail.com> wrote:
>>
>>> I've always just hacked support for this into the various JITs (for
>>> JuliaLang, in our debuginfo.cpp file), by setting the
>>> no-frame-pointer-elim flag in the IR, then creating and populating a dummy
>>> unwind description object in the .text section, and registering that
>>> dynamically. Some day I hope to actually just register the .pdata/.xdata
>>> sections with the unwinder.
>>>
>>> PDBs are a bit different though, since the above steps work well for
>>> gdb, but generally I find that WinDbg is less willing or able to be given
>>> JIT-frame information from LLVM. (I assume somehow it can be done, for
>>> dotNET. I just don't know how.)
>>>
>>> On Sat, Feb 29, 2020 at 11:07 PM Reid Kleckner via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Yes, I think https://bugs.llvm.org/show_bug.cgi?id=24233 needs to be
>>>> implemented to fix this.
>>>>
>>>> The Windows x64 unwinder doesn't generally look at frame pointers. We
>>>> would need to register unwind info to make this work. What you see is
>>>> fairly typical of attempting to unwind the stack when unwind info is
>>>> missing.
>>>>
>>>> PDBs shouldn't generally enter into the picture.
>>>>
>>>> On Sat, Feb 29, 2020 at 8:14 AM Vivien Millet via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm using IR and MCJIT to compile a script language. I debug it with
>>>>> .pdb files generated on the fly. During debugging, almost every time I
>>>>> step into a function, I lose information about the calling function in
>>>>> the Visual Studio call stack view, or I get a bunch of raw addresses in
>>>>> the call stack between the current function and the calling function,
>>>>> for example:
>>>>>
>>>>> MyJit.dll!MyCurrentFunction()
>>>>> [0x1234567887654321]
>>>>> [0x8765432112345678]
>>>>> MyJit.dll!MyCallingFunction()
>>>>> ...
>>>>>
>>>>> It looks like Visual Studio gets lost while walking up the stack.
>>>>> Does anyone know where this could come from?
>>>>>
>>>>> I have disabled all optimisations (among them is the
>>>>> omit-frame-pointer).
>>>>>
>>>>> I have seen this bug:
>>>>> https://bugs.llvm.org/show_bug.cgi?id=24233 which is quite similar,
>>>>> but it is quite old now, and since the proposed patch was posted, the
>>>>> code in RuntimeDyldCOFFX86_64.h has changed and it is difficult for me
>>>>> to know whether it has really been fixed since then.
>>>>>
>>>>> Could it be related to the way IR CreateAlloca calls are used to build
>>>>> local variables? Could it be related to missing information inside the
>>>>> PDB? (I don't know if there is stack-related information inside PDB
>>>>> files to ensure good stack walking.)
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Vivien
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>