[LLVMdev] RFC: GEP as canonical form for pointer addressing
Hal Finkel
hfinkel at anl.gov
Sat Feb 15 07:22:57 PST 2014
----- Original Message -----
> From: "Philip Reames" <listmail at philipreames.com>
> To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Friday, February 14, 2014 7:18:21 PM
> Subject: [LLVMdev] RFC: GEP as canonical form for pointer addressing
>
> RFC: GEP as canonical form for pointer addressing
>
> I would like to propose that we designate GEPs as the canonical form for
> pointer addressing in LLVM IR before CodeGenPrepare.
Is this not already the case? I did not think that any passes introduce ptrtoint+arithmetic+inttoptr prior to CGP. On the other hand, we don't convert ptrtoint+arithmetic+inttoptr into GEP when we can (which is PR14226 -- where Eli (cc'd) said this is unsafe in general).
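
For concreteness, here is a minimal C sketch of the two address computations being contrasted. The function names are made up for illustration. The assumption (not something the thread states) is that a front end such as Clang lowers the first function to a getelementptr and the second to a ptrtoint, integer arithmetic, and inttoptr sequence, which is what one typically sees in unoptimized output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Typed pointer arithmetic. Assumption: a front end typically emits a
 * getelementptr for this, so the result stays visibly a pointer. */
int *element_via_pointer_arithmetic(int *base, size_t index) {
    return &base[index];
}

/* The same address computed through an integer round trip. Assumption:
 * this is typically emitted as ptrtoint, a multiply and add, then inttoptr. */
int *element_via_integer_arithmetic(int *base, size_t index) {
    uintptr_t address = (uintptr_t)base;   /* pointer to integer */
    address += index * sizeof(int);        /* plain integer arithmetic */
    return (int *)address;                 /* integer back to pointer */
}

/* Hypothetical self-check: both forms should name the same element on a
 * conventional flat-address-space target. */
void check_forms_agree(void) {
    int values[4] = {10, 20, 30, 40};
    assert(element_via_pointer_arithmetic(values, 2) ==
           element_via_integer_arithmetic(values, 2));
}
```

Both functions compute the same address, but only the first keeps the derivation from `base` explicit in the types, which is the property a pass that must tell pointers from integers would be relying on.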
-Hal
>
> Corollaries
> 1) It is legal for an optimizer to convert ptrtoint+arithmetic+inttoptr
> sequences to GEPs, but not vice versa.
> 2) Input IR which does not contain inttoptr instructions will never
> contain inttoptr instructions (before CodeGenPrepare.)
>
> I've spoken with Nick Lewycky & Owen Anderson offline at the last
> social. On first reflection, both were okay with the proposal, but I'd
> like broader buy-in and discussion. Nick & Owen, if I've accidentally
> misrepresented our discussion or you've had second thoughts since,
> please speak up.
>
>
> Background & Motivation
>
> We want to support precise garbage collection(1) in LLVM. To do so, we
> have written a pass which inserts safepoints, read barriers, and write
> barriers as appropriate. This pass needs to be able to reliably(2)
> identify pointer vs non-pointer values. It's advantageous to run this
> pass as late as practical in the optimization pipeline, but we can
> schedule it before lowering begins (i.e. before CodeGenPrepare).
>
> We control the initial IR which is generated and can ensure that it does
> not contain any inttoptr instructions. We're looking to have a
> guarantee(*) that a random LLVM optimization pass will not decide to
> replace GEPs with sequences of ptrtoint, integer arithmetic, and
> inttoptr, which are hard for us to reason about.
>
> * "guarantee" isn't really the right word here. I'm really just looking
> to make sure that the community is comfortable with GEPs as canonical
> form. If some pass decides to insert inttoptr instructions into
> otherwise clean IR, I want some assurance a patch fixing that would
> stand a good chance of being accepted. I'm happy to do any cleanup
> required.
>
> In addition to my own use case, here are a few others which might come
> up:
> - Backends for targets which support different operations on pointers vs
> integers. Examples would be some of the older mainframe architectures.
> (There'd be a lot more work needed to support this.)
> - Various security related applications (e.g. CFI w.r.t. function
> pointers)
>
> I don't really want to get into these applications in detail, mostly
> because I'm not particularly knowledgeable on those topics. I'd
> appreciate any other applications anyone wants to throw out, but let's
> try to keep from derailing the discussion. (As I did to Nick's original
> thread on DataLayout. :))
>
> Notes:
> 1) We're not using the existing gc.root implementation strategy. I plan
> on explaining why in a lot more detail once we're closer to having a
> complete implementation that we can upstream. That should be coming
> relatively shortly. (i.e. months, not weeks, not years)
>
> 2) As Nick pointed out in a separate thread, other types of typecasts
> can obscure pointer vs integer classifications. (i.e. casting the base
> type of a pointer we then load through could load a field of the
> "wrong" type.) I plan on responding to his point separately, but let's
> leave that out of this discussion for the moment. Having GEPs as
> canonical form is a step forward by itself, even if I decide to propose
> something further down the road.
>
> Philip
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory