[llvm-dev] RFC: Harvard architectures and default address spaces
Dylan McKay via llvm-dev
llvm-dev at lists.llvm.org
Mon Jul 10 22:54:10 PDT 2017
Hello all, I’m looking into solving an AVR-specific issue and would love to
hear people’s thoughts on how best to fix it.
As you may or may not know, I maintain the in-tree AVR backend, which also
happens to be (to the best of my knowledge) the first in-tree backend for a
Harvard architecture.
In this architecture, code lives inside the ‘program memory’ space
(numbered 1), whereas data lives in the RAM “data space”, which corresponds
to the default address space 0. This is important because loads/stores use
different instruction/pointer formats depending on the address space used,
and so we need correct address space information available to the backend.
Because address spaces in LLVM default to 0, all global or constant
variables default to living inside data space. This causes a few issues.
One is that the SimplifyCFG pass creates switch lookup tables, which
default to data space, causing us to emit broken table lookups and also
wasting precious RAM.
The problem - emitting pointers as operands
*NOTE*: Feel free to skip to the tl;dr of this section if you don’t care
too much about the details.
There are different instructions which require different fixups to be
applied depending on whether pointers are located in data space or program
space.
Take the ICALL instruction - it performs an indirect call to the pointer
stored in the Z register.
We must first load the pointer into Z via the ‘ldi’ instruction. If the
pointer is actually a pointer to a symbol, we need to emit an AVR_LO8_LDI_GS
relocation, otherwise we emit an AVR_LO8_LDI relocation. There are a few
other cases, but they’re irrelevant for this discussion.
We can quite easily look at the GlobalValue* that corresponds to the
pointer if it is a symbol and select the fixup based on that, but that
assumes that the address spaces are always correct.
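To make the dependency concrete, here is a minimal sketch of that selection logic. It is a toy model, not the backend's real code: Symbol, Fixup and selectLo8Fixup are hypothetical names, and it assumes that symbols in program space (1) take the _GS variant while everything else takes the plain one.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; these are not LLVM's real types.
enum AddressSpace : unsigned { DataSpace = 0, ProgramSpace = 1 };

enum class Fixup { LO8_LDI, LO8_LDI_GS };

struct Symbol {
  // Mirrors LLVM's behaviour of defaulting every global value to space 0.
  unsigned AddrSpace = DataSpace;
};

// Pick the fixup for the low byte of an address loaded by `ldi`.
// Assumption: program-space symbols need the _GS relocation.
Fixup selectLo8Fixup(const Symbol &S) {
  assert(S.AddrSpace <= ProgramSpace && "address space not modelled here");
  return S.AddrSpace == ProgramSpace ? Fixup::LO8_LDI_GS : Fixup::LO8_LDI;
}
```

A symbol left at the default address space silently gets the data-space fixup, which is exactly the failure mode that matters when the recorded address space is wrong.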
Now, imagine that the pointer is actually a function pointer. LLVM does not
expose any way to set the address space in the IR for functions, but because
Function is derived from GlobalValue, it does have an address space, and that
address space defaults to zero. Because of this, every single function
pointer in the AVR backend that gets loaded by the ldi will be associated
with data space, and not program space, which it actually belongs to.
*tl;dr* functions default to address space zero, even though they are in a
different space on Harvard architectures, which causes silent codegen bugs
when we rely on the address space of a global value.
It would be impossible to set the address space correctly on creation of
llvm::Function objects because at that point in the pipeline, we do not
know the target architecture.
Because of this, I’d like to extend TargetTransformInfo with hooks such as
getSwitchTableAddressSpace() and getFunctionAddressSpace(). I already have
a WIP patch for this here <https://reviews.llvm.org/D34983>.
Once we have that information available to TargetTransformInfo, I propose
we add a pass (very early in the codegen pipeline) that sets the address
space of all functions to whatever value is specified in the hooks.
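A rough sketch of how the hooks and the pass could fit together, using toy types rather than LLVM's real classes. Function, TargetHooks, AVRHooks and assignFunctionAddressSpaces are hypothetical names for illustration only; the hook names come from this proposal, and returning 1 for AVR switch tables is my assumption, not something the WIP patch is guaranteed to do.

```cpp
#include <cassert>
#include <vector>

// Toy model; LLVM's real Function and TargetTransformInfo look different.
struct Function {
  unsigned AddrSpace = 0; // every function starts in the default space
};

// Stand-in for the proposed TargetTransformInfo hooks.
struct TargetHooks {
  virtual ~TargetHooks() = default;
  virtual unsigned getFunctionAddressSpace() const { return 0; }
  virtual unsigned getSwitchTableAddressSpace() const { return 0; }
};

// A Harvard target overrides the defaults (assumed values for AVR).
struct AVRHooks : TargetHooks {
  unsigned getFunctionAddressSpace() const override { return 1; }
  unsigned getSwitchTableAddressSpace() const override { return 1; }
};

// The early pass: stamp the target's function address space on every
// function. Returns true if anything changed.
bool assignFunctionAddressSpaces(std::vector<Function> &Module,
                                 const TargetHooks &TTI) {
  const unsigned AS = TTI.getFunctionAddressSpace();
  // To my knowledge LLVM address spaces are limited to 24 bits.
  assert(AS < (1u << 24) && "address space out of range");
  bool Changed = false;
  for (Function &F : Module) {
    if (F.AddrSpace != AS) {
      F.AddrSpace = AS;
      Changed = true;
    }
  }
  return Changed;
}
```

With the default hooks the pass is a no-op, so targets that do not override them would see no change in behaviour.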
This works well because we don’t let frontends specify address space on
functions, nor do we even mention that functions have address spaces in the
The downside is that you wouldn’t normally expect something like an address
space to change midway through the compilation process. To counter that,
however, I doubt the pre-codegen code cares much about the value of function
address spaces, if at all.
On top of this, at the current point in time,
Pointer<Function>::getAddressSpace is downright incorrect on any Harvard
architecture, and for other architectures, the address space for functions
will still stay the default of zero and will not change at all.
Does anybody know anything I haven’t thought of? Any reasons why this
solution is suboptimal?