[llvm-dev] Custom Instruction Cost Model to LLVM RISC-V Backend

John McCall via llvm-dev llvm-dev at lists.llvm.org
Wed May 27 15:30:56 PDT 2020


On 27 May 2020, at 17:13, Henrik Olsson via llvm-dev wrote:
> I'm unclear on whether it works with automatic variables. Testing it in
> clang gives an error message "automatic variable qualified with address
> space".
> However here is an LLVM discussion on the semantics of alloca with
> different address spaces:
> https://lists.llvm.org/pipermail/llvm-dev/2015-August/089706.html
> Some people seem to have gotten multiple separate stacks working, according
> to my skim read.
> So it might be technically supported in LLVM, but not in clang.

The alloca address space changes the address space of the stack, but
there’s still only one stack.  So Clang supports generating code with a
non-zero alloca address space, but it doesn’t support changing the address
space of individual local variables.
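
To make the distinction concrete, here is roughly what it looks like at
the source level (just a sketch; address space 3 is an arbitrary example
number, and the exact diagnostic wording may differ between Clang versions):

    // A global in a named address space is fine; Clang simply places it there.
    __attribute__((address_space(3))) int g;

    void f(void) {
      // A local in a named address space is rejected:
      //   error: automatic variable qualified with address space
      // __attribute__((address_space(3))) int x;

      // Plain locals are fine.  They all live on the single stack, and the
      // address space of that stack is whatever the target's data layout
      // declares as the alloca address space, not something chosen per
      // variable.
      int y = 0;
      (void)y;
    }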

Bandhav, you may find it interesting to look into some of the work done
for GPU targets.  I think AMDGPU has the ability to compile arbitrary
C/C++ code for their GPU.  Ordinary C/C++ code is unaware of address
spaces, but the hardware has the traditional GPU memory model of different
private/global/constant address spaces, plus a generic address space
which encompasses all of them (but is much more expensive to access).
By default, you treat an arbitrary C pointer as a pointer in the generic
address space, but LLVM and Clang know that local and global variables are
in various other address spaces.  That creates a mismatch, which Clang
handles by implicitly promoting pointers to the generic address space
when you take the address of a local/global.  In the optimizer, you can
recognize accesses to promoted pointers and rewrite them to be accesses
in the original address space.  This is then relatively easy to combine
with address-space attributes so that you can explicitly record that a
particular pointer is known to be in a particular address space.
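
As a simplified sketch of that optimizer-side rewrite (LLVM's
InferAddressSpaces pass does a far more thorough version of this for the
GPU targets, and exact APIs vary a bit across LLVM versions), the idea is
to look through an addrspacecast into the generic space and re-issue the
access in the original, cheaper address space:

    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/Instructions.h"

    using namespace llvm;

    // If a simple load goes through an addrspacecast into the generic
    // address space, re-issue it directly in the original address space.
    // GenericAS is whatever numbered space the target uses as "generic".
    static bool rewriteGenericLoad(LoadInst &LI, unsigned GenericAS) {
      if (!LI.isSimple())
        return false;
      auto *Cast = dyn_cast<AddrSpaceCastInst>(LI.getPointerOperand());
      if (!Cast || Cast->getDestAddressSpace() != GenericAS)
        return false;

      // The cast's source operand is the same pointer, still in its
      // specific address space (private, global, constant, ...).
      Value *OrigPtr = Cast->getPointerOperand();
      IRBuilder<> B(&LI);
      LoadInst *NewLoad = B.CreateAlignedLoad(LI.getType(), OrigPtr,
                                              LI.getAlign(), LI.getName());
      LI.replaceAllUsesWith(NewLoad);
      LI.eraseFromParent();
      return true;
    }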

Of course, for the promotion part of that to be useful to you, you need
specific kinds of memory (e.g. the stack) to fall into meaningful address
ranges for your cost model, or else you’ll be stuck treating almost every
access conservatively.  You could still use address spaces to know that
specific accesses are faster, but not being able to make default
assumptions about anything will be really limiting.
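
As a very rough illustration of where the address space would feed in: the
memory-op cost hook in TargetTransformInfo already receives the address
space of the access, so a RISC-V implementation could in principle key off
it.  The sketch below follows approximately the LLVM 10-era signature
(which changes between releases), and FastScratchAS plus the 4x penalty
are invented numbers; conceptually this would live in
RISCVTargetTransformInfo.cpp:

    // Rough sketch only: the hook's exact signature tracks the LLVM
    // version, and FastScratchAS plus the penalty factor are made up.
    int RISCVTTIImpl::getMemoryOpCost(unsigned Opcode, Type *Src,
                                      MaybeAlign Alignment,
                                      unsigned AddressSpace,
                                      TTI::TargetCostKind CostKind,
                                      const Instruction *I) {
      const unsigned FastScratchAS = 1; // assumed "fast local memory" space
      int Cost = BaseT::getMemoryOpCost(Opcode, Src, Alignment, AddressSpace,
                                        CostKind, I);
      // Charge extra for anything not known to be in the fast space.
      if (AddressSpace != FastScratchAS)
        Cost *= 4;
      return Cost;
    }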

That’s also all designed for an implementation where pointers in
different address spaces are actually representationally different.  It
might be overkill just for better cost-modeling of a single address space
with non-uniform access costs.  In principle, you could get a lot of work
done just by doing a quick analysis to see if an access is known to be
to the stack.
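
A minimal version of that quick analysis could just walk back to the
underlying object of the pointer and test whether it is an alloca; anything
that fails the test gets treated conservatively.  (The helper is spelled
GetUnderlyingObject or getUnderlyingObject depending on the LLVM version.)

    #include "llvm/Analysis/ValueTracking.h"
    #include "llvm/IR/Instructions.h"

    using namespace llvm;

    // Quick-and-dirty check: is this access known to hit the stack?
    // Walks back through GEPs/bitcasts to the underlying object and asks
    // whether it is an alloca.
    static bool isKnownStackAccess(const Value *Ptr) {
      const Value *Obj = getUnderlyingObject(Ptr);
      return isa<AllocaInst>(Obj);
    }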

John.

