[llvm-dev] [EXT] [RFC] arm64_32: upstreaming ILP32 support for AArch64

Fri Feb 1 02:28:08 PST 2019

Hi Eli,

Thanks for the comments.

On Thu, 31 Jan 2019 at 19:48, Eli Friedman <efriedma at quicinc.com> wrote:
> > We teach CodeGenPrepare to sink GEPs as GEPs, and preserve the
> > inbounds marker. This is the only way they can possibly be exposed to
> > SDAG at the basic block level.
>
> Isn't addr-sink-using-gep already a thing?

Yes, I'm not sure why I wrote that (maybe I saw the new
addrSinkUsingGEPs in a patch and misremembered). It looks like what I
actually did was attempt to decouple the logic. It's currently based
on useAA, which seems to be an orthogonal question to me, so I added a
new virtual function hook. I'm now suspicious of the logic there too,
though. I'll inspect it further before uploading anything for review.

> > Second is the intrusiveness. On the plus side it's less intrusive than
> > ISD::GEP would be, but it still involves changes in some fairly
> > obscure bits of DAG -- often found when things broke rather than by
> > careful planning.
>
> Did you consider modeling this with address spaces?  LLVM already has robust support for address spaces with different pointer sizes,

I have to say I didn't, but I don't think it would solve the problem.
Alternate address-spaces still have just one pointer size per space as
far as I'm aware. If that's 64-bits we get efficient CodeGen but
loading or storing a pointer clobbers more data than it should, if
that's 32-bits then we get poor CodeGen.

> and you probably want to expose support for 64-bit pointers anyway.

It's a possibility, though no-one has asked for it yet. The biggest
request we've actually had is for signed 32-bit pointers so that both
TTBR0 and TTBR1 regions can be used. I could see a pretty strong
argument for exposing unsigned pointers via a different address-space
in that regime (for use in user_addr_t in kernel code), though you'd
have to be pretty disciplined to make it work I think.

> I'm not sure I follow the difference between [2 x i32] and i64: if they both go into a single register, why do you need both?  Or is this necessary to support your automatic translation pass?

Yep, it's entirely because we need to support code generated for
armv7k. On that platform [2 x i32] and i64 have different alignment
requirements on the stack; [2 x i32] would be used for struct {
int32_t val[2]; }, i64 would be used for struct { int64_t val; }. But
because AArch64 AAPCS puts more data in registers, some of these args
generated for the stack go in registers on arm64_32.

Cheers.

Tim.