[Lldb-commits] [PATCH] D62505: Fix multiple module loaded with the same UUID

Wed Jun 5 01:48:11 PDT 2019

The "overlapping sections" thread reminded me that I should reply here.
It also reminded me of another possible use case for multiply-mapped
sections. Thread-local sections (.tbss, .tdata in ELF) contain data
which is somehow (the exact mechanisms are still quite opaque to me)
mapped into memory separately for each thread. I am not sure what we do
about these things now, but this also sounds like something that
could/should be modelled as a single section being loaded multiple times.

On 31/05/2019 00:23, Jim Ingham wrote:
> I agree that the platform path has to be held by the target, not by the module.  In your example lldb would love to have a single local copy of the executable so it can do reads locally, which makes it clear that there are real cases where it would be obvious two targets should share the same Module, but would need to have different remote names for it.
> 
> I don't see how extending the Platform Path in the module to be a vector can handle this situation.  We have tried to keep Modules from knowing about Targets, but the only way to make sense of all the platform paths associated with the Module is to know which one is used by a given Target.  So you'd really have to have a pair of Name & Target.  That argues that the Module was the wrong place for this information to begin with.
> 
> You could imagine doing this by having a map of platform path -> module in the target, but then there are all sorts of places you'd have to be careful to consult this, and that seems fallible.
Yes, this sounds pretty sub-optimal.

> 
> It really sounds like the Target should be dealing with TargetModules (a pair of Module & PlatformPath) and not straight Modules.  Probably for the sake of reducing changes we could have "CacheModules" which are what Modules are today, and Modules that are the pair of PlatformPath and Module.  I think this is what Greg was talking about with his BaseModule class.

I think I like the direction of this. This additional separation might 
also help with another problem I've been having with how modules work. 
Right now, if we call Module::SetSymbolFileSpec, it will construct a new 
SymbolFile, but then still keep the old one around just in case somebody 
uses it. It would be cleaner if "TargetModule::SetSymbolFileSpec" could 
just create a new instance of "CacheModule" with the new symbol file, and 
just drop the reference to the old CacheModule.

And then we could avoid storing a vector of "platform paths" in the 
TargetModule by just creating two TargetModules -- since they would be 
sharing the same CacheModule, this wouldn't cost much.

> 
> Then if you want to support the same CacheModule being reused in a given target, the SectionLoadList  would have to hold a pair of Section/TargetModule - a TargetSection? - so it would know which instance of the module was meant.  I think you also have to do the same thing with Address.  It currently holds a Section, but that isn't fully specified in the case where the Module can appear twice.  It would have to have a TargetSection instead.  Except that you can pull Address's out of Modules w/o going through a Target.  So again you might have to make a distinction between Address and TargetAddress, which might get messy.

This might get messy, but it also might enable us to clean some things 
up and/or strengthen some invariants. For instance, if we call the 
TargetSections "LoadedSections" and create them only when they are 
loaded into a target, then we can maintain the invariant that a 
LoadedSection can always resolve itself to a "load address" (right now 
we can do that only if a section happens to be loaded), and to a 
"CacheSection".

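A minimal sketch of that invariant, assuming made-up "CacheSection" and 
"LoadedSection" types (neither exists in lldb, and the members are only 
illustrative). A LoadedSection cannot be constructed without a load 
address, so it can always answer both questions, and the same 
CacheSection can back several LoadedSections:

```cpp
#include <cstdint>
#include <memory>
#include <utility>

using addr_t = std::uint64_t;

// Hypothetical target-independent section, owned by a CacheModule.
struct CacheSection {
  addr_t file_addr; // address recorded in the object file
  addr_t byte_size;
};

// Hypothetical: created only when a section is loaded into a target, so a
// load address is always available. Assumes `section` is non-null.
class LoadedSection {
public:
  LoadedSection(std::shared_ptr<const CacheSection> section, addr_t load_base)
      : m_section(std::move(section)), m_load_base(load_base) {}

  addr_t GetLoadBaseAddress() const { return m_load_base; }
  const CacheSection &GetCacheSection() const { return *m_section; }

  // True if `addr` falls inside this particular mapping of the section.
  bool ContainsLoadAddress(addr_t addr) const {
    return addr >= m_load_base && addr - m_load_base < m_section->byte_size;
  }

private:
  std::shared_ptr<const CacheSection> m_section;
  addr_t m_load_base;
};
```
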
The Address class can currently hold two kinds of addresses: a load 
address and a section+offset combo. For converting between the two, one 
has to go through a target. I think it would make sense for the 
conversion process to return something of a different type, as then one 
would always know which kind of address one is dealing with. This might 
also be interesting for the address space folks, as the address space 
may need to be represented differently for the two address kinds. For a 
"sectioned address" the address space can probably be given implicitly 
via the section, but a "load address" would need to store the address 
space explicitly.
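
Something along these lines, perhaps. This is only a sketch with 
invented names (ToySection, ToyTarget, SectionedAddress and LoadAddress 
are not lldb classes), and the integer address space id is an 
assumption. The point is just that the two kinds of address are 
different types and that converting between them needs a target:

```cpp
#include <cstdint>
#include <map>

using addr_t = std::uint64_t;

// Hypothetical section that knows which address space it lives in.
struct ToySection {
  int address_space;
  addr_t byte_size;
};

// Section + offset: the address space is implied by the section.
struct SectionedAddress {
  const ToySection *section;
  addr_t offset;
};

// Raw load address: the address space has to be stored explicitly.
struct LoadAddress {
  int address_space;
  addr_t value;
};

// Hypothetical target that records where each section is loaded.
class ToyTarget {
public:
  void SetLoadBase(const ToySection &section, addr_t base) {
    m_load_bases[&section] = base;
  }

  // Fails if the section is not loaded in this target.
  bool ToLoadAddress(const SectionedAddress &in, LoadAddress &out) const {
    auto it = m_load_bases.find(in.section);
    if (it == m_load_bases.end())
      return false;
    out = LoadAddress{in.section->address_space, it->second + in.offset};
    return true;
  }

  // Fails if no loaded section in the same address space covers the address.
  bool ToSectionedAddress(const LoadAddress &in, SectionedAddress &out) const {
    for (const auto &entry : m_load_bases) {
      const ToySection *section = entry.first;
      const addr_t base = entry.second;
      if (section->address_space == in.address_space && in.value >= base &&
          in.value - base < section->byte_size) {
        out = SectionedAddress{section, in.value - base};
        return true;
      }
    }
    return false;
  }

private:
  std::map<const ToySection *, addr_t> m_load_bases;
};
```

With distinct types, a function that takes a LoadAddress cannot be 
handed a SectionedAddress by accident.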

pl