[llvm-dev] RFC: Absolute or "fixed address" symbols as immediate operands

Chandler Carruth via llvm-dev llvm-dev at lists.llvm.org
Wed Oct 26 10:10:28 PDT 2016


To what Reid said, I'm not really worried about impact on the middle end of
any of this. We can handle the code changes, etc.

I agree with Chris about what we're trading off here:

On Tue, Oct 25, 2016 at 10:48 PM Chris Lattner via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> I’d argue the other side of it.  The quality of the code is higher if we
> have invariants (like all globals are pointers) because that simplifies
> assumptions by eliminating cases where “is a pointer” appears to be true,
> but isn’t actually true in all cases.  I’m not an expert on CFI or how
> widely it will ultimately impact the compiler hacker consciousness, but I’m
> pretty sure that the current model for globals and functions will remain
> more prominent.  If you choose to break this invariant, you’ll be
> continually swimming upstream against assumptions made throughout the
> compiler, both in code written today but also in code written in the future.
>

I agree that mental assumptions the developers on the middle end hold are
the primary challenge here. But I think we are going to run into challenges
either way.

If the type of these entities is an integer, we will have a non-pointer
global, yes. But as Peter points out, this is caught effectively by asserts
in the cast infrastructure and other programming aids. Essentially, the
checking of LLVM's type system helps protect the random middle end
developer from getting this wrong.

On the other hand, if the type of these entities remains consistently a
pointer, we will still break assumptions that middle end developers
routinely make about pointers to globals, because these entities differ
from ordinary globals in several ways:
- They aren't dereferenceable
- They aren't aligned
- They may be null
- The difference between them might not be representable in a
pointer-sized integer
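
To make the last bullet concrete, here is a minimal sketch of the arithmetic. It assumes a 64-bit target where the pointer-sized integer is a signed 64-bit type, and the helper name `differenceFitsInIntptr` is made up for illustration (it is not an LLVM API). Two absolute symbols can sit anywhere in the unsigned 64-bit range, so their true difference can need 65 bits.

```cpp
#include <cstdint>

// Hypothetical helper: does the mathematical difference (a - b) of two
// absolute symbol values fit in a signed 64-bit integer?
// Assumption: 64-bit target, pointer-sized integer is int64_t.
constexpr bool differenceFitsInIntptr(std::uint64_t a, std::uint64_t b) {
    if (a >= b) {
        // Non-negative difference: must not exceed INT64_MAX (2^63 - 1).
        return a - b <= static_cast<std::uint64_t>(INT64_MAX);
    }
    // Negative difference: magnitude may be at most 2^63 (that is, INT64_MIN).
    return b - a <= static_cast<std::uint64_t>(INT64_MAX) + 1u;
}
```

For two objects that live inside the same allocation this check would normally pass, which is why code tends to assume it; for arbitrary fixed addresses such as 0 and `UINT64_MAX` it does not.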

In essence, their *values* won't behave like pointers even if we make the
LLVM IR type a pointer. So the wrong assumption will shift from an IR type
system error to a value error. I would generally much prefer the LLVM IR's
type system catch this kind of error. Even if these wrong assumptions about
the values are much less common, I would prefer the more common but easily
caught type system error.
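
As a rough illustration of that distinction, here is a toy model. None of these names are LLVM's real classes or functions; `Value`, `TypeKind`, `castToPointer`, and the two non-null helpers are hypothetical stand-ins. The point is only that a kind check can reject an integer-typed symbol up front, while a pointer-typed symbol whose value is 0 passes the same check and silently violates a "globals are never null" assumption.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for an IR type kind; not LLVM's actual type hierarchy.
enum class TypeKind { Integer, Pointer };

// Toy stand-in for a global symbol: its IR type kind plus its fixed value.
struct Value {
    TypeKind kind;
    std::uint64_t bits;
};

struct PointerValue {
    std::uint64_t address;
};

// Type-level query: the kind of check a cast helper can assert on.
constexpr bool isPointer(const Value &v) {
    return v.kind == TypeKind::Pointer;
}

// Checked cast in the spirit of assert-based cast helpers: with assertions
// enabled, a mismatch aborts immediately at the point of misuse.
inline PointerValue castToPointer(const Value &v) {
    assert(isPointer(v) && "castToPointer on a non-pointer value");
    return PointerValue{v.bits};
}

// A value-level assumption a pass might bake in: "a pointer to a global is
// never null". Nothing in the type check above can validate this.
constexpr bool assumedNonNull(const Value &v) {
    return isPointer(v);
}

// What is actually true of the symbol's value.
constexpr bool actuallyNonNull(const Value &v) {
    return v.bits != 0;
}
```

With an absolute symbol whose value is 0, modeling it as `TypeKind::Integer` makes `isPointer` false, so `castToPointer` would trip its assert. Modeling it as `TypeKind::Pointer` makes `assumedNonNull` return true while `actuallyNonNull` returns false, and no check fires.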


To Rafael's point, while I agree that at the object file level these are
indeed addresses, I personally am much more interested in the IR modeling
things in ways convenient to the middle end optimizer than to the linker.
And there, the above seems like the dominant tradeoff.


Anyways, if Reid, Chris, and Rafael all strongly feel like keeping the
types consistent is actually the right tradeoff, I don't want to stand in
the way. So far, I just find the arguments for why this is the right
tradeoff unconvincing.

-Chandler


PS: In case it isn't clear, I'm totally fine with having the range metadata
available for globals that are mapped into very specific regions on bare
metal / embedded architectures but *should* still be treated as actual
addresses of objects that can be loaded and stored through. Those seem
unambiguously like they should be pointers. But that seems like a separate
use case and discussion...