[llvm-dev] RFC: Harvard architectures and default address spaces

Dylan McKay via llvm-dev llvm-dev at lists.llvm.org
Tue Jul 11 15:18:00 PDT 2017


Hello Hal,

> Add this information to DataLayout and to use that information in
> relevant places.

This sounds like a much better/cleaner idea, thanks!
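
For illustration, here is a minimal, self-contained sketch of what carrying this information in the data layout could look like. It is only a model: the `P<n>` component, `LayoutSketch`, and `parseLayoutSketch` are hypothetical names made up for this example, not existing LLVM APIs, and it uses no LLVM headers.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the part of a data layout that would record
// which address space functions live in. Not an LLVM class.
struct LayoutSketch {
  unsigned ProgramAddrSpace = 0; // default: functions stay in address space 0
};

// Parse a '-'-separated layout string. A component of the form "P<n>"
// (an assumption made for this sketch) sets the program address space.
// All other components are ignored here. A malformed number after 'P'
// makes std::stoul throw.
LayoutSketch parseLayoutSketch(const std::string &Spec) {
  LayoutSketch L;
  std::size_t Pos = 0;
  while (Pos <= Spec.size()) {
    std::size_t End = Spec.find('-', Pos);
    if (End == std::string::npos)
      End = Spec.size();
    const std::string Comp = Spec.substr(Pos, End - Pos);
    if (Comp.size() > 1 && Comp[0] == 'P')
      L.ProgramAddrSpace = static_cast<unsigned>(std::stoul(Comp.substr(1)));
    Pos = End + 1;
  }
  return L;
}
```

With a layout string such as "e-P1-p:16:8", the sketch reports program address space 1, while a string without a `P` component keeps the default of 0. Because the layout travels with the module, the value is available whether or not the target's backend is linked in.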



On Wed, Jul 12, 2017 at 1:13 AM, Hal Finkel <hfinkel at anl.gov> wrote:

>
> On 07/11/2017 12:54 AM, Dylan McKay via llvm-dev wrote:
>
> Hello all, I’m looking into solving an AVR-specific issue and would love
> to hear people’s thoughts on how best to fix it.
>
> Background
>
> As you may or may not know, I maintain the in-tree AVR backend, which also
> happens to be (to the best of my knowledge) the first in-tree backend for a
> Harvard architecture.
>
> In this architecture, code lives inside the ‘program memory’ space
> (numbered 1), whereas data lives inside RAM “data space”, which corresponds
> to the default address space 0. This is important because loads/stores use
> different instruction/pointer formats depending on the address space used,
> and so we need correct address space information available to the backend
> itself.
>
> Because address spaces in LLVM default to 0, all global or constant
> variables default to living inside data space. This causes a few issues:
> for example, the SimplifyCFG pass creates switch lookup tables, which
> default to data space, causing us to emit broken table lookups and
> wasting precious RAM.
>
> The problem - emitting pointers as operands
>
> *NOTE*: Feel free to skip to the tl;dr at the end of this section if you
> don’t care too much about the details.
>
> There are different instructions which require different fixups to be
> applied depending on whether pointers are located in data space or program
> space.
>
> Take the ICALL instruction - it performs an indirect call to the pointer
> stored in the Z register.
>
> We must first load the pointer into Z via the ‘ldi’ instruction. If the
> pointer is actually a pointer to a symbol, we need to emit an
> AVR_LO8_LDI_GS relocation, otherwise we emit an AVR_LO8_LDI relocation.
> There are a few other cases, but they’re irrelevant for this discussion.
>
> We can quite easily look at the GlobalValue* that corresponds to the
> pointer if it is a symbol and select the fixup based on that, but that
> assumes that the address spaces are always correct.
>
> Now, imagine that the pointer is actually a function pointer. LLVM does
> not expose any way to set the address space of a function in the IR, but
> because Function derives from GlobalValue, it does have an address space,
> and that address space defaults to zero. Because of this, every single
> function pointer in the AVR backend that gets loaded by ldi will be
> associated with data space, and not with program space, where it actually
> belongs.
>
> *tl;dr*: functions default to address space zero, even though they live
> in a different space on Harvard architectures, which causes silent
> codegen bugs when we rely on the address space of a global value.
>
> Proposed solution
>
> It would be impossible to set the address space correctly on creation of
> llvm::Function objects because at that point in the pipeline, we do not
> know the target architecture.
>
> Because of this, I’d like to extend TargetTransformInfo with hooks such
> as getSwitchTableAddressSpace() and getFunctionAddressSpace(). I already
> have a WIP patch for this here <https://reviews.llvm.org/D34983>.
>
> Once we have that information available to TargetTransformInfo, I propose
> we add a pass (very early in the codegen pipeline) that sets the address
> space of all functions to whatever value is specified in the hooks.
>
> This works well because we don’t let frontends specify address space on
> functions, nor do we even mention that functions have address spaces in the
> language reference.
>
> The downside is that you wouldn’t normally expect something like an
> address space to change midway through the compilation process. To
> counter that, however, I doubt the pre-codegen code cares much about the
> value of function address spaces, if at all.
>
> On top of this, Pointer<Function>::getAddressSpace is currently
> downright incorrect on any Harvard architecture, and for other
> architectures, the address space for functions will stay at the default
> of zero and will not change at all.
>
> Does anybody know anything I haven’t thought of? Any reasons why this
> solution is suboptimal?
>
>
> Hi, Dylan,
>
> Being able to specify the address space of functions, etc. is a good idea.
> Given the current design, you can't put this into TargetTransformInfo,
> however, because nothing in TTI may be required for correctness (because
> your target's implementation might not be available). Information required
> for correctness must go in DataLayout (because it must always be
> available). You should propose patches to add this information to
> DataLayout and to use that information in relevant places.
>
>  -Hal
>
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>
>