[llvm-dev] [llvm-pdbutil] : merge not working properly

Vivien Millet via llvm-dev llvm-dev at lists.llvm.org
Thu Jan 17 10:23:57 PST 2019


OK, I see.
What do you mean by “making sure to de-duplicate records as necessary”?

On Thu, Jan 17, 2019 at 19:09, Zachary Turner <zturner at google.com> wrote:

> It's possible in theory to support incremental updates to a PDB (the file
> format is designed specifically with that in mind).  But this functionality
> was never added to the PDB library: lld doesn't support incremental
> linking, so we never really needed it.
>
> The "dumb" way would be to just create a new PDB file, build it using the
> old contents and the new contents (making sure to de-duplicate records as
> necessary).
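
A minimal sketch of what "de-duplicate records" could look like when building the new file from the old and new contents. Everything here is a hypothetical stand-in (Record, MergedTable, mergeRecords are not LLVM APIs); it only assumes that two records with identical bytes can share one slot, and that callers need an old-index to new-index map afterwards.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for one serialized record (its raw bytes).
using Record = std::string;

struct MergedTable {
  std::vector<Record> records;                   // unique records, in insertion order
  std::unordered_map<Record, uint32_t> indexOf;  // record bytes -> index in `records`

  // Add the record only if an identical one is not already present.
  // Returns the index of the record in the merged table either way.
  uint32_t insert(const Record &r) {
    auto it = indexOf.find(r);
    if (it != indexOf.end())
      return it->second;
    uint32_t idx = static_cast<uint32_t>(records.size());
    records.push_back(r);
    indexOf.emplace(r, idx);
    return idx;
  }
};

// Merge `src` into `dst`. The result maps each index in `src` to the index
// the same record has in `dst`, so references can be rewritten later.
std::vector<uint32_t> mergeRecords(MergedTable &dst,
                                   const std::vector<Record> &src) {
  std::vector<uint32_t> remap;
  remap.reserve(src.size());
  for (const Record &r : src)
    remap.push_back(dst.insert(r));
  return remap;
}
```

The returned map matters as much as the merged table: any record that refers to another one by index has to be rewritten through it, otherwise it keeps pointing at the old numbering.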
>
> Supporting incremental updates should be possible, but most of LLVM's File
> I/O abstractions are based around mmapping a file and writing to it, which
> doesn't work when you don't know the file size in advance.  So there would
> be some interesting problems to solve here.
>
> On Thu, Jan 17, 2019 at 10:03 AM Vivien Millet <vivien.millet at gmail.com>
> wrote:
>
>> Hi Zachary !
>> Is there a way to easily create a new PDBFileBuilder from an existing
>> PDBFile, or can/should I do the translation myself?
>> I would like to start from a builder filled with the EXE PDB data and
>> then complete its DBI stream with the JIT module/symbols.
>>
>> Thanks !
>>
>>
>> On Wed, Jan 16, 2019 at 23:41, Vivien Millet <vivien.millet at gmail.com>
>> wrote:
>>
>>> Thank you Zachary !
>>> I will have some soon, I think.
>>> I first need to explore the llvm-pdbutil code more because I don't even
>>> know where to start with the PDB API.
>>>
>>> On Wed, Jan 16, 2019 at 22:51, Zachary Turner <zturner at google.com>
>>> wrote:
>>>
>>>> Sure. Along the way I’m happy to answer any specific questions you
>>>> might have too even if it’s for your downstream project
>>>> On Wed, Jan 16, 2019 at 1:38 PM Vivien Millet <vivien.millet at gmail.com>
>>>> wrote:
>>>>
>>>>> I would be up for improving pdbutil, but I doubt I have enough knowledge
>>>>> or time to provide the complete merge feature; it would still be a very
>>>>> specific kind of merge, as you describe it. Anyway, I could start by trying
>>>>> to do it in my JIT compiler and then, once I get something working (if that
>>>>> happens :)), I can come back to you with the piece of code and see whether
>>>>> it is worth integrating into pdbutil, and how?
>>>>>
>>>>> On Wed, Jan 16, 2019 at 22:12, Zachary Turner <zturner at google.com>
>>>>> wrote:
>>>>>
>>>>>> Well, that’s certainly possible, but improving llvm-pdbutil is
>>>>>> another possibility. Doing it directly in your jit compiler will probably
>>>>>> save you time though, since you won’t have to worry about writing tests and
>>>>>> going through code review
>>>>>> On Wed, Jan 16, 2019 at 1:01 PM Vivien Millet <
>>>>>> vivien.millet at gmail.com> wrote:
>>>>>>
>>>>>>> Thanks for the tips !
>>>>>>> When you talk about doing all of this, I suppose you mean
>>>>>>> using llvm/DebugInfo/PDB, picking code here and there to generate the
>>>>>>> PDB in memory, reading the executable's PDB and performing the merge
>>>>>>> directly in my JIT compiler, right? Not using pdbutil?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jan 15, 2019 at 22:49, Zachary Turner <zturner at google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jan 15, 2019 at 2:50 AM Vivien Millet <
>>>>>>>> vivien.millet at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hello Zachary !
>>>>>>>>> Thanks for your time !
>>>>>>>>> So you are one of the happy guys who suffered from the lack of PDB
>>>>>>>>> format information :)
>>>>>>>>>
>>>>>>>> Yes, that would be me :)
>>>>>>>>
>>>>>>>>
>>>>>>>>> To be honest I'm really a beginner in the PDB stuff, I just read
>>>>>>>>> some llvm documentation to understand what went wrong when merging my PDBs.
>>>>>>>>> In my case, what my team and I are trying to achieve is this:
>>>>>>>>> - Run our application under a Visual Studio debugger
>>>>>>>>> - Generate JIT code (using LLVM MCJIT)
>>>>>>>>> - Then, either:
>>>>>>>>>    - export a COFF obj file with DWARF information and then
>>>>>>>>>      convert it with cv2pdb to obtain a PDB of my JIT symbols
>>>>>>>>>      (what I do now)
>>>>>>>>>    - export my JIT debug info directly to PDB (what I would like
>>>>>>>>>      to do, if you have an idea how..)
>>>>>>>>> - Detach the Visual Studio debugger
>>>>>>>>> - Merge my JIT PDB into a copy of the executable PDB (where things
>>>>>>>>>   start to go bad..)
>>>>>>>>> - Replace the original executable with the copy (creating a backup
>>>>>>>>>   of the original)
>>>>>>>>> - Reattach the Visual Studio debugger to my executable (loading
>>>>>>>>>   the new PDB version)
>>>>>>>>> - Debug JIT code with Visual Studio.
>>>>>>>>> - On each JIT rebuild, restart these steps from the original
>>>>>>>>>   native executable PDB to avoid merge conflicts between the
>>>>>>>>>   multiple JIT iterations
>>>>>>>>>
>>>>>>>> Yea, it's an interesting use case.  It makes me think it would be
>>>>>>>> nice if the PDB format supported some way of having a symbol which simply
>>>>>>>> refers to another PDB file, that way you could re-write that PDB file at
>>>>>>>> runtime once all your code is jitted, and when the debugger tries to look
>>>>>>>> up that symbol, it finds a record that tells it to go check the other PDB
>>>>>>>> file.
>>>>>>>>
>>>>>>>> So, here are the things I think you would need to do:
>>>>>>>>
>>>>>>>> 1) Create a JIT module in the module list with a unique name.  All
>>>>>>>> symbols will go here.  llvm-pdbutil dump -modules shows you the list.  Be
>>>>>>>> careful about putting it at the end though, because there's already one at
>>>>>>>> the end called * LINKER * that is kind of special.  On the other hand, you
>>>>>>>> don't want to put it first because it means you will have to do lots of
>>>>>>>> fixups on the EXE PDB.  It's probably best to add it right before the
>>>>>>>> linker module, this has the least chance of breaking anything.
>>>>>>>>
>>>>>>>> 2) In the debug stream for this module, add all symbols.  You will
>>>>>>>> need to fix up their type indices.  As you noticed, llvm-pdbutil already
>>>>>>>> merges type information from the JIT PDB, so after merging the type indices
>>>>>>>> in the EXE PDB will be different than they were in the JIT PDB, but the
>>>>>>>> symbol records will refer to the JIT PDB type indices.  So these need to be
>>>>>>>> fixed up.  LLD already has code to do this, you can probably borrow a
>>>>>>>> similar algorithm with some slight modifications (lld/COFF/PDB.cpp, search
>>>>>>>> for mergeSymbolRecords)
>>>>>>>>
>>>>>>>> 3) Merge in the new section contributions and section map.  See LLD
>>>>>>>> again for how to modify these.  Hopefully the object file you exported
>>>>>>>> contains relocated symbol addresses so you don't have to do any fixups here.
>>>>>>>>
>>>>>>>> 4) Merge in the publics and globals.  This shouldn't be too hard, I
>>>>>>>> think you can just iterate over them in the JIT PDB and add them to the new
>>>>>>>> EXE PDB.
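
Steps 1) and 2) above can be sketched roughly as follows. This is not the LLD code and uses none of the LLVM PDB classes: Symbol, Module, insertBeforeLinker and remapTypeIndices are hypothetical names for illustration, and a symbol is assumed to carry a single type index (real CodeView symbol records are variable-length and can hold several). The linker module name is passed in by the caller; check the exact spelling with llvm-pdbutil dump -modules. The one format fact relied on is that CodeView type indices below 0x1000 are built-in simple types, which are the same in every PDB and so need no fixup.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for PDB concepts.
struct Symbol {
  std::string name;
  uint32_t typeIndex;  // refers to the JIT PDB's type stream until remapped
};

struct Module {
  std::string name;
  std::vector<Symbol> symbols;
};

// CodeView type indices below this value are built-in simple types.
constexpr uint32_t kFirstNonSimpleIndex = 0x1000;

// Step 1: place the JIT module right before the linker module.
// If no module has that name, the JIT module is appended at the end.
// Returns the position the JIT module ends up at.
std::size_t insertBeforeLinker(std::vector<Module> &modules, Module jit,
                               const std::string &linkerName) {
  auto it = std::find_if(modules.begin(), modules.end(),
                         [&](const Module &m) { return m.name == linkerName; });
  std::size_t pos = static_cast<std::size_t>(it - modules.begin());
  modules.insert(it, std::move(jit));
  return pos;
}

// Step 2: rewrite JIT type indices to the indices the same types received
// in the merged EXE PDB. Returns false, leaving the module untouched, if
// any non-simple index has no entry in the map.
bool remapTypeIndices(Module &m,
                      const std::unordered_map<uint32_t, uint32_t> &jitToExe) {
  // First pass: validate, so a failure cannot leave a half-remapped module.
  for (const Symbol &s : m.symbols)
    if (s.typeIndex >= kFirstNonSimpleIndex && jitToExe.count(s.typeIndex) == 0)
      return false;
  // Second pass: apply the mapping.
  for (Symbol &s : m.symbols)
    if (s.typeIndex >= kFirstNonSimpleIndex)
      s.typeIndex = jitToExe.at(s.typeIndex);
  return true;
}
```

The jitToExe map is whatever the type merge produced (JIT PDB index to merged EXE PDB index). Remapping before inserting the module keeps a failed fixup from ever reaching the module list.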
>>>>>>>>
>>>>>>>> You're kind of in uncharted territory here, so this is just a rough
>>>>>>>> idea of what needs to be done.  There may be other issues that you don't
>>>>>>>> encounter until you actually try it out.
>>>>>>>>
>>>>>>>> Unfortunately I don't personally have the time to work on this, but
>>>>>>>> it sounds neat, and I'm happy to help if you run into questions or problems
>>>>>>>> along the way.
>>>>>>>>
>>>>>>>>
>>>>>>>>