[PATCH] D148042: [MSP430] Get the DWARF pointer size from MCAsmInfo instead of DataLayout.

Wed Apr 19 14:12:46 PDT 2023

dblaikie added a comment.

In D148042#4278656 <https://reviews.llvm.org/D148042#4278656>, @asl wrote:

> @dblaikie as far as I can see, there are several interconnected issues here.
>
> First of all, ELF could be 32-bit or 64-bit (EI_CLASS field in ELF header), there are no 16-bit ELF files defined around and I do not think it's a good idea to invent something new here.

Agreed.

> Next, LLDB indeed assumes that pointer size coincides with ELF class and I would not consider it as a bug :)

I would - DWARF has a field for pointer size & it doesn't have to be/isn't tied to the ELF class.

> And it seems there is no way to specify that inside ELF file that ELF32 file could contain 16-bit pointers.

There is at the DWARF level.

But I guess you mean for the actual code in an ELF32 file you have to have 32 bit pointers for relocations to be applied, etc? So what does it mean for for the program to have 16 bit pointers but be using 32 bit relocations?

> We are seeing a precedent of msp430-gcc to generate 32-bit pointers inside ELF32 file for MSP430 architecture and eventually we need to be able to mix-and-match linking with such objects as they could be in e.g. CRT, etc.

The ability to mix and match doesn't seem like "we have to do what GCC does" - it might still mean that there can/should be fixes to consumers. But equally I'm generally a fan of "do whatever GCC does" when it comes to DWARF, since it's far too broad to say "consumers should handle anything the DWARF spec allows".

> So, it seems we need a way to distinguish between "platform address size" that it used for code generation, optimizations, etc. and "address size used for DWARF emission" (both for data and code!).

I still don't know if we should/do need to distinguish those things, but maybe we do. I'm still not sure how the 16 bit pointers work (disregarding DWARF for a minute) with a 32 bit ELF file...

> Currently LLVM assumes that they are the same and the patch allows a platform to override it. Even more, MSP430X extention uses 20-bit pointers, so we'd end with something similar here one way or another.
>
> It seems currently they things a bit messy in how we're emitting DWARF data (see https://discourse.llvm.org/t/what-is-a-codepointersize-and-how-does-it-relate-to-dwarf/64825 for example)

Ah, yeah, that thread discusses some of the directions I'd have in mind.

If this is a regression, it'd be good to have context for where these things were changed to use DataLayout (a link to the commit(s)) and an explanation of why changing them back won't harm whatever motivated them to be changed to use DataLayout in the first place.

Then, yes, as mentioned in that thread, adding a single entry point for these things would be important (just fixing the 5 sites directly seems like a bad idea if these should all be kept in sync and there are easy ways to mistakenly use the wrong value as has been the case in the past, perhaps).

But I'd like to understand the problem better too/first - it doesn't seem right to me that DWARF would need a different size from the system/target itself. (& no, I don't especially know what the "codepointer" size is as distinct from the address size either)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148042/new/

https://reviews.llvm.org/D148042