[PATCH] D11969: Avoid using of DataLayout::getPointerSize() without address space argument in DWARF info generation

Tue Aug 11 22:03:12 PDT 2015

rampitec added a comment.

In http://reviews.llvm.org/D11969#222450, @joker.eph wrote:

> I don't have enough context and the description not clear enough for me to understand what is going on. The fact that you claim that the default DataLayout getPointerSize() does not return the expected size sounds fishy to me.
>  Can you elaborate?

Sure. Our architecture has several distinct memory segments that physically reside in different memory. These segments even have separate memory access instructions and separate capabilities. Notably, some segments use 32-bit pointers and some use 64-bit pointers. Different address spaces are used to represent these memory segments. In particular, private per-thread memory, which is usually used for alloca, resides in address space 0, while shared global memory is in address space 1. Private memory has 32-bit addressing and global memory has 64-bit addressing. This is represented by the LLVM data layout string fragment "p0:32:32-p1:64:64". In essence this is a 64-bit architecture in which just some memory types are 32-bit.
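To make the layout fragment concrete, here is a minimal sketch of how the pointer entries map address spaces to sizes. It is a toy parser written for illustration, not llvm::DataLayout; the name parsePointerSpecs is hypothetical, and it only reads entries of the form "p<AS>:<size>:<abi>".

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Toy model only (hypothetical helper, not an LLVM API):
// returns address space number -> pointer size in bits.
std::map<unsigned, unsigned> parsePointerSpecs(const std::string &Layout) {
  std::map<unsigned, unsigned> SizeInBits;
  std::istringstream Stream(Layout);
  std::string Spec;
  // Layout entries are separated by '-'.
  while (std::getline(Stream, Spec, '-')) {
    if (Spec.empty() || Spec[0] != 'p')
      continue; // Not a pointer entry, skip it.
    std::size_t Colon = Spec.find(':');
    assert(Colon != std::string::npos && "pointer entry needs a size field");
    // A bare "p" with no number means address space 0.
    unsigned AS = 0;
    if (Colon > 1)
      AS = static_cast<unsigned>(std::stoul(Spec.substr(1, Colon - 1)));
    // std::stoul stops at the next ':', so only the size field is read.
    unsigned Bits = static_cast<unsigned>(std::stoul(Spec.substr(Colon + 1)));
    SizeInBits[AS] = Bits;
  }
  return SizeInBits;
}
```

For the fragment above, this yields 32 bits for address space 0 and 64 bits for address space 1.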

Now, if you look at DataLayout::getPointerSize(), it takes an argument AS, which stands for the address space. The declaration is: "unsigned getPointerSize (unsigned AS=0) const". Note that AS defaults to 0, which is usually all one needs on an x86 CPU. There is also a comment attached to the declaration: "FIXME: The defaults need to be removed once all of the backends/clients are updated." It is here: http://llvm.org/docs/doxygen/html/classllvm_1_1DataLayout.html#a9e653935f1d7ff84ff4a14667f4ca567

So basically this call is never supposed to be used without an argument; the default exists only for the transition period. In general it is not possible to tell a pointer's size without knowing its address space; that only works on some architectures. Over time the default is supposed to be removed, and the only showstoppers are the backends that still need the update. No common code is supposed to call it without an argument.
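A small sketch of the pitfall, using a toy stand-in rather than the real class. ToyLayoutWithDefault and makeMixedLayout are hypothetical names, and the fallback to address space 0 for undescribed address spaces is an assumption of this sketch.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for a data layout:
// address space -> pointer size in bytes.
struct ToyLayoutWithDefault {
  std::map<unsigned, unsigned> PointerBytes;

  // Same shape as the declaration quoted above, including the default.
  unsigned getPointerSize(unsigned AS = 0) const {
    auto It = PointerBytes.find(AS);
    // Assumption of this sketch: an address space without its own entry
    // uses the entry for address space 0.
    if (It == PointerBytes.end())
      It = PointerBytes.find(0);
    assert(It != PointerBytes.end() && "sketch needs an entry for AS 0");
    return It->second;
  }
};

// Toy layout corresponding to "p0:32:32-p1:64:64".
inline ToyLayoutWithDefault makeMixedLayout() {
  ToyLayoutWithDefault L;
  L.PointerBytes = {{0, 4}, {1, 8}};
  return L;
}
```

With this layout, a call without an argument silently answers for address space 0 and returns 4, while getPointerSize(1) returns 8. Nothing at the call site shows that an address space was chosen at all.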

Yet, as I said, it is common to call an architecture 64-bit even if not all of its segments are 64-bit. This is like the medium or compact memory model on early x86, with some pointers 32-bit and some 16-bit. One application where you practically need a general pointer size is DWARF, and essentially anything else where you would like to refer to the largest type that can hold any target pointer. So a call with the default argument 0 may work for this purpose even on architectures with mixed pointer sizes, but unfortunately it does not work if the pointer in address space 0 is smaller than in some other address space.

BTW, there is another possible approach here. We could create two different versions of getPointerSize() in DataLayout (and then two versions of getPointerSizeInBits(), of all the alignment queries, etc.), one with the argument and one without. The one without could return the maximum pointer size across all address spaces. It looks very easy to do, but it is generally error-prone because it encourages use of the version without an argument, just as the default =0 does now.
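Roughly, the two-overload alternative could look like the sketch below. This is a toy illustration with a hypothetical ToyLayoutWithMax type, not a proposed patch to DataLayout, and the map-based storage is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <map>

// Hypothetical stand-in for a data layout:
// address space -> pointer size in bytes.
struct ToyLayoutWithMax {
  std::map<unsigned, unsigned> PointerBytes;

  // Explicit query: the caller states which address space it means.
  unsigned getPointerSize(unsigned AS) const {
    auto It = PointerBytes.find(AS);
    assert(It != PointerBytes.end() && "address space not described");
    return It->second;
  }

  // Argument-free query: the largest pointer size in any address space.
  unsigned getPointerSize() const {
    unsigned Max = 0;
    for (const auto &Entry : PointerBytes)
      Max = std::max(Max, Entry.second);
    return Max;
  }
};
```

With entries {0: 4, 1: 8}, the argument-free call returns 8, which is what something like DWARF wants, while getPointerSize(0) still returns 4. The concern stays the same: the short spelling is the easy one to reach for, even where a specific address space is meant.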

Repository:
  rL LLVM

http://reviews.llvm.org/D11969