[llvm-dev] RFC: Opaque pointer status and future direction

Wed Dec 18 15:15:46 PST 2019

On Wed, Dec 18, 2019 at 7:16 AM Tim Northover via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Hi all,
>
> At the dev meeting I promised to update everyone on where my work with
> opaque pointers is right now. It's taken me a while, but at least it's the
> same year!
>
> Current Status
> ==============
>
> I've put two branches up at https://github.com/TNorthover/llvm-project:
> "opaque-ptr" which has most of the real work so far; and
> "opaque-ptr-always" that additionally has a patch to force every pointer to
> be opaque and see what falls over. It's about 40 patches on top of master in
> a few categories.
>
> 1. Serialization: bitcode <-> in-memory <-> textual IR[0].
> 2. Relaxing assertions in Instruction constructors and the Verifier so
>    that we don't assume every pointer has an element type.
> 3. Modifying passes and other components to get their element types from
>    other sources when needed. This is where I see the bulk of the future
>    work in LLVM itself.
>
> All of them are very much from my dev machine and not prepared for real
> review.
>
> To give an idea of the work ahead, running "ninja check" on
> "opaque-ptr-always" gives about 4500 failures.
>
> Many of these are of course CHECK lines looking for typed pointers that
> LLVM will never again print; we'll need some kind of script to automate
> converting as many of those tests as possible.
>
> Beyond that, there are still about 800 assertion failures, but looking at
> the backtraces I think that there are "only" 75ish distinct callsites[1]
> that would have to be fixed, plus whatever's revealed behind them.
>
>
> Future Direction
> ================
>
> I think this work needs to happen more incrementally. It's really not
> great that I've built up a backlog of 40 patches that only I have access to
> and can work on.
>
> So at a high level I think we should put the serialization and Instruction
> changes in sooner rather than later, giving us a largely undocumented[2]
> dialect of IR with opaque pointers that we can write tests against to
> upstream the rest of what I've done (and others can use to continue work in
> parallel if they're inclined).
>

My admittedly rather vague plan was to change the API down to the point
where the only primitive left was "propagate the pointee type from one
place to another", with no other way to query it. Well, with a deprecated
way to query it whose callers we could chase down as the main migration.
Once we got to zero "getElementType" callers we could figure out the actual
IR migration piece.
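
To make that concrete, here is a rough sketch of the API shape I have in
mind. It is a toy model, not LLVM code: ToyType, ToyPointerType and
withPointeeOf are names invented for this mail, and only "getElementType"
is borrowed from the real API.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for illustration only; these are not LLVM classes.
struct ToyType {
  std::string name;
};

class ToyPointerType {
public:
  explicit ToyPointerType(const ToyType *pointee) : pointee_(pointee) {}

  // The one supported primitive: build a pointer type that carries the same
  // pointee as `other`, without revealing what that pointee is.
  static ToyPointerType withPointeeOf(const ToyPointerType &other) {
    return ToyPointerType(other.pointee_);
  }

  // Legacy query, kept only so remaining callers still compile. Each use
  // produces a compiler warning, which gives a list of callsites to migrate.
  [[deprecated("take the type from another source instead")]]
  const ToyType *getElementType() const { return pointee_; }

  // Equality is still allowed; it compares without exposing the pointee.
  friend bool operator==(const ToyPointerType &a, const ToyPointerType &b) {
    return a.pointee_ == b.pointee_;
  }

private:
  const ToyType *pointee_;
};
```

A caller that only forwards a pointer type uses withPointeeOf and never sees
the pointee; a caller that still needs getElementType shows up as a
deprecation warning until it is migrated.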

Do you think that wouldn't be viable & that introducing the new opaque
pointer type sooner would be better/more viable?

> The risk is of course that this becomes yet another unfinished feature we
> drag around for years, with a corresponding maintenance burden. And it's a
> real risk: I unfortunately don't have the go-ahead to work on this full time.
>
> But I don't think the alternatives are much better. Even full time I don't
> think I could do it completely alone because some choices will need input
> from experts. Even if I could, it would finish with a patch bomb even bigger
> than what I'm dropping here.
>
>
> Proposal
> ========
>
> Short term (because otherwise we can't do it for another six months):
>
> 1. Add inalloca(<ty>) support.
>

Is byval already fixed/changed? I may've lost track of some of these
changes, but I knew that was next on my list. (& how was byval addressed -
byval(<ty>) or byval(byte count)? I guess inalloca goes/should go the same
way as byval)

> 2. Document for January release the planned removal of:
>   * Old style byval
>   * Old style inalloca
>   * Typeless CreateCall, CreateLoad, CreateGEP.
>

Sounds good to me.
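
For anyone wondering why the typeless builder calls have to go: once the
pointer is opaque, the loaded type has to be stated by the caller and stored
on the instruction. A toy sketch of that idea follows. Kind, Value, Load,
ToyBuilder and loadSizeInBytes are all made up for illustration, and the
8-byte pointer size is an assumption; the real IRBuilder overloads take an
llvm::Type* instead.

```cpp
#include <cassert>
#include <cstddef>

// Toy model for illustration only; none of these are LLVM classes.
enum class Kind { Int32, Int64, Ptr };

// A value knows only its own kind. A Ptr value records no pointee kind,
// which is what "opaque" means in this sketch.
struct Value {
  Kind kind;
};

// The load records what it reads, so nothing has to ask the pointer.
struct Load {
  Kind resultKind;
  Value address;
};

class ToyBuilder {
public:
  // Typed form: the caller states the loaded kind explicitly.
  Load createLoad(Kind resultKind, Value address) const {
    assert(address.kind == Kind::Ptr && "load address must be a pointer");
    return Load{resultKind, address};
  }
};

// A consumer (think of a pass) reads the access size from the instruction.
// Pointer size is assumed to be 8 bytes in this toy.
inline std::size_t loadSizeInBytes(const Load &load) {
  switch (load.resultKind) {
  case Kind::Int32: return 4;
  case Kind::Int64: return 8;
  case Kind::Ptr:   return 8;
  }
  return 0; // unreachable for valid Kind values
}
```

The same pointer value can feed a 4-byte and an 8-byte load here, which is
exactly the case a typeless createLoad(address) could not express.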

> 3. Soon after January branch, strip out those bits. The third in
> particular should prevent front-end regression; I had to fix a fair
> few new deprecated callsites in Clang when rebasing everything this
> week.
>
> Short/medium term:
>
> 1. Commit serialization and Instruction changes.
> 2. Use that to add tests for patches I already have and upstream them.
> 3. Keep fixing the issues, but no-one not working on opaque pointers
>    should need to change their behaviour other than a general
>    encouragement to not use getElementType unless they have to.
>
> Long term:
>
> As we get close to everything working, we should shift the expectations so
> that new uses of getElementType aren't allowed in LLVM.
>
> Front-ends (including Clang) will need more work I suspect.
>
> No doubt there will be performance issues where having a pointee type
> helps some heuristic be a bit better. We'll have to decide what to do about
> those.
>
> [0] See attached opaque.ll for some proposed IR.
> [1] See attached asserts.txt if interested.
> [2] Or perhaps more likely documented with "don't use it unless you're
> working on opaque pointers" warnings.