[PATCH] D64006: [Support] Support 64-bit offsets in DataExtractor.

Wed Jul 3 11:25:29 PDT 2019

dblaikie added a comment.

In D64006#1568874 <https://reviews.llvm.org/D64006#1568874>, @ikudrin wrote:

> In D64006#1568562 <https://reviews.llvm.org/D64006#1568562>, @dblaikie wrote:
>
> > I think the main one is code duplication - write once/fix once/etc is valuable.
>
>
> In that case, what do you think about templates?

They help, for sure - though I'm not sure even that complexity (having DataExtractor templated on an integer type for the cursor) is necessary. At least I don't see it yet; maybe with better understanding I might - but for now it really sounds like we would've built this with uint64_t only if we'd started from scratch.
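
For what it's worth, here's a rough sketch of what a caller might look like against a uint64_t-only cursor, following the reasoning in the quoted text below: a record stores a fixed 32-bit offset, and the code copies it into a local before extracting so the stored value isn't mutated. `Extractor64`, `Entry` and `sumTwoAttrBytes` are made-up names for illustration only, not the real DataExtractor API or anything in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical extractor with a 64-bit cursor only (an assumption for
// illustration; this is not the real llvm::DataExtractor).
struct Extractor64 {
  const uint8_t *Data;
  uint64_t Size;

  // Reads one byte at *OffsetPtr and advances the cursor.
  // On an out-of-bounds read, returns 0 and leaves the cursor untouched.
  uint8_t getU8(uint64_t *OffsetPtr) const {
    uint64_t Offset = *OffsetPtr;
    if (Offset >= Size)
      return 0;
    *OffsetPtr = Offset + 1;
    return Data[Offset];
  }
};

// Hypothetical DIE-like record that stores a fixed 32-bit offset.
struct Entry {
  uint32_t AttrOffset;
};

// The stored offset must stay valid, so it is copied into a local cursor
// that the extractor is allowed to mutate. That local is simply 64-bit.
unsigned sumTwoAttrBytes(const Extractor64 &DE, const Entry &E) {
  uint64_t Offset = E.AttrOffset; // widening copy, no cast needed
  unsigned Sum = DE.getU8(&Offset);
  Sum += DE.getU8(&Offset);
  return Sum;
}
```

The only difference from the 32-bit version of this pattern is the type of the local, which is why it doesn't look like much of an imposition on callers.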

>>> Moreover, this is just a utility class. It should provide its users with functionality, not force them to satisfy its whims, especially if that complicates things for them without any noticeable value; I mean, storing 32-bit offsets and creating a temporary 64-bit variable just to call the utility class does not seem elegant either. It is the callers who know how big their data is and which data type for offsets reflects their needs better.
>> 
>> I think if we knew then what we know now we'd have built it with 64 bit offsets & it wouldn't've been that much imposition to clients that can use 32 bit offsets.
>> 
>> Did you come across cases where you would need to insert new 64 bit temporaries? My understanding would be that anything storing a lot of offsets in a data structure would be storing fixed offsets that were not intended to be mutated (eg: the DIEs in a DIE tree might contain an offset to the start of the attributes in that DIE) - so you wouldn't want to pass the address of that offset directly to DataExtractor (because it would mutate it, then the DIE in the DIE tree would have an offset that no longer points to the start of the attributes - but somewhere in the middle/end - making it unusable from then on) - so such code would likely copy the offset into a local intended for mutation. The difference now is that local would be 64 bit. That doesn't seem like an imposition to me.
> 
> Seems you are right. We may come across some unusual cases during the transition, but we can solve them in the corresponding patches.
> 
>>> Anyway, as I said, my main intent is to add support for 64-bit DWARF. While there are other users, `DataExtractor` is mainly used in the `DebugInfo/DWARF` library and I expect to change it to use 64-bit offsets while implementing that support. Hopefully, that will be done with a bunch of relatively small patches. After that, we can decide what to do with the remaining callers. Unfortunately, I cannot promise that this transition can be done in several weeks. It seems that two to four months is a more realistic estimate.
>> 
>> Fair enough - though would it be possible to prioritize finishing the DataExtractor migration (or demonstrating it is not desirable) before necessarily fleshing out the rest of the DWARF 64 support? I'd be concerned it might be left lingering otherwise.
> 
> Well, in many cases they are connected. You need 64-bit offsets because they can be found in 64-bit DWARF sections. Thus, migrating to 64-bit offsets is half-way to implementing 64-bit DWARF in the `DebugInfo/DWARF` library, so I think it is better to migrate class-by-class, adding DWARF64 support consciously, rather than just mechanically replacing 32-bit offsets with 64-bit ones. We have already seen that the mechanical approach does not work well.
> 
> Maybe we can postpone applying this patch until we have a whole set for the migration. Surely, it will require some effort to keep all patches up to date, and I would be happy to avoid that. But this might be a reasonable way if you want the migration to be as fast as possible.

Nah, it's fine - happy to let you & @aprantl carry on here - holding the patches out of tree doesn't buy us anything. The main concern is just that things don't get cleaned up & are left in a hybrid state, and that risk exists even if we delay this going in-tree until some slightly later point.

To answer one of @aprantl's later questions - nah, I don't think these need different names. The overloads are distinct: a pointer to uint32_t can't implicitly convert to/from a pointer to uint64_t - so there doesn't seem to be any great risk of confusion there.
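
A minimal sketch of why the same name is safe, assuming a simplified stand-in class (`MiniExtractor` and its private `readU8` helper are hypothetical, not the real DataExtractor): the two overloads differ only in the pointee type of the cursor, and the static_asserts check that neither pointer type converts implicitly to the other, so the caller's variable type picks exactly one overload.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in with both cursor widths under one name
// (an illustration only; not the real llvm::DataExtractor).
class MiniExtractor {
  const uint8_t *Data;
  uint64_t Size;

  // Shared implementation, so the read logic is written once.
  template <typename OffsetT> uint8_t readU8(OffsetT *OffsetPtr) const {
    OffsetT Offset = *OffsetPtr;
    if (Offset >= Size)
      return 0; // out of bounds: cursor is left unchanged
    *OffsetPtr = Offset + 1;
    return Data[Offset];
  }

public:
  MiniExtractor(const uint8_t *Data, uint64_t Size) : Data(Data), Size(Size) {}

  uint8_t getU8(uint32_t *OffsetPtr) const { return readU8(OffsetPtr); }
  uint8_t getU8(uint64_t *OffsetPtr) const { return readU8(OffsetPtr); }
};

// Pointers to different integer types never convert implicitly, so a call
// with &Off32 can only bind to the 32-bit overload, and likewise for 64-bit.
static_assert(!std::is_convertible<uint32_t *, uint64_t *>::value,
              "uint32_t* must not convert to uint64_t*");
static_assert(!std::is_convertible<uint64_t *, uint32_t *>::value,
              "uint64_t* must not convert to uint32_t*");
```

Passing the address of any other integer type (say, a uint16_t) would fail to compile rather than silently pick one of the two.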

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64006/new/

https://reviews.llvm.org/D64006