[PATCH] D37052: Add default address space for functions to the data layout (1/4)

Sat Aug 26 15:54:24 PDT 2017

asb added subscribers: prakhar, rengolin.
asb added inline comments.

================
Comment at: docs/LangRef.rst:1883
+    Harvard architectures can use this to specify what space LLVM
+    should place things such as functions or switch lookup tables
+    into. If omitted, the program memory space defaults to the
----------------
dylanmckay wrote:
> asb wrote:
> > theraven wrote:
> > > kparzysz wrote:
> > > > dylanmckay wrote:
> > > > > kparzysz wrote:
> > > > > > Aren't switch tables considered "data" in a Harvard architecture?
> > > > > I think in the strictest sense, you are correct.
> > > > > 
> > > > > However, I think in the real world, it makes more sense for this kind of data to live in program memory.
> > > > > 
> > > > > For example, in the AVR, all global variables (regardless of address space) live inside the executable, and this live inside program memory. The routines that run on device startup will find all variables that live in RAM, and then must copy everything over.
> > > > > 
> > > > > This is necessary because RAM is always initialised to zero on startup, if you want to have specific data in RAM, it _needs_ to be copied over.
> > > > > 
> > > > > In the cast of switch tables, we should not need to copy them over to RAM. They will never change, along with the fact that RAM accesses take longer than program memory. I imagine this is true of other Harvard architectures as well.
> > > > > 
> > > > > On top of this, if switch tables lived in RAM it would mean that any switches converted into tables by `SimplifyCFG` will branch on uninitialised memory if run without startup code. This is not a particularly strong argument as lots of bad things happen when run without startup code, but I would hope that we could at least evaluate a `switch` without it.
> > > > > 
> > > > > One other side note is that not all AVRs have RAM, and so switch tables would need to live in program memory in these chips. Again, not a super strong argument though because if you have no stack, you probably shouldn't be using LLVM.
> > > > > 
> > > > > tl;dr I think this is more a matter of practicality rather than what we consider data versus code in the academic sense.
> > > > My concern here is that a subsequent patch always puts switch tables in the program memory.  I don't know if that's valid for all architectures.
> > > I agree.  Program memory is correct for us as well, but on a number of microcontrollers you don't have load instructions at all for the program memory and so they need to be in the data-ROM address space.
> > Putting jump tables in a data section is something the ARM backend supported as of rL289784 in order to support execute-only code (i.e. no loads allowed from the code section). Would this sort of thing be better off moving towards using separating address spaces for code and data?
> That makes sense - what are thoughts on creating some sort of target-specific hook (probably `TargetTransformInfo`) specific to switch tables that can be overridden and use that instead?
> 
> 
That sounds interesting to me, but it could be a bit of a rabbit hole. @prakhar, @rengolin - do you have any thoughts on this? Did you consider such an approach for the ARM execute-only support?

https://reviews.llvm.org/D37052