[PATCH] D149268: [LLD][COFF] Fix PDB relocation on big-endian hosts

Ulrich Weigand via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 27 08:23:44 PDT 2023


uweigand marked 3 inline comments as done.
uweigand added inline comments.


================
Comment at: lld/COFF/PDB.cpp:958
     auto unrelocatedRvaStart = subsecData.take_front(sizeof(uint32_t));
     uint8_t relocatedRvaStart[sizeof(uint32_t)];
     debugChunk->writeAndRelocateSubsection(debugChunk->getContents(),
----------------
aganea wrote:
> uweigand wrote:
> > aganea wrote:
> > > uweigand wrote:
> > > > aganea wrote:
> > > > > Can't we just write `support::ulittle32_t rvaStart;` here and static cast to a uint8_t* it below in the function call?
> > > > I guess that would be safe (in particular w.r.t. aliasing rules) if `uint8_t` is implemented as an `unsigned char`.  This is probably always true in practice, but may not be absolutely guaranteed by the C++ standard.  Maybe this is why the original code used a memcpy here?
> > > > 
> > > > Let me know if you want me to make that change.   (The generated code should be identical with all modern compilers anyway.)
> > > > 
> > > I feel this could help future maintainers of this code. The expectations should be clear at the caller site, and it is not quite clear now why storage for uint32_t has be allocated through a uint8_t, just to memcpy it back into a `ulittle32_t` after the call. This all feels quite strange. If we want to avoid the casting here maybe `writeAndRelocateSubsection` should take a `void*` and let it deal with casting internally?
> > I mean, that `memcpy` is a typical pattern to work around type-based aliasing violations potentially triggering undefined behavior, so my assumption is this is why the `memcpy` was used here.  (To be clear, the problem is not the cast as such, it's the aliasing enabled by it, and therefore it doesn't matter w.r.t. undefined behavior whether or not the cast is done in the caller or the callee.)
> > 
> > What would make it safe without a memcpy would be to change the type of the `buf` argument to a `char *` (or `unsigned char *`) in `writeAndRelocateSubsection` and its subroutines and access it as such consistently.  [ Or, as I said, simply make and accept the assumption that `uint8_t` is implemented as `unsigned char` on all supported platforms.  ]
> Indeed I forgot about all this type punning. Since all the other usages of `writeAndRelocateSubsection` don't need punning let's leave it like this. Should we add a short comment to that effect above memcpy in case it isn't obvious for everyone?
Fair enough; added a comment now.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149268/new/

https://reviews.llvm.org/D149268



More information about the llvm-commits mailing list