[LLVMdev] Moving towards a singular pointer type

Wed Feb 18 06:58:14 PST 2015

It might help after adding explicit types where needed and before adding the opaque pointer type, to change all pointers to a single type in the existing type system, i8* could work or maybe a weird type that would shake out any issues like i99*.

If all pointers were changed to this weird type and all GEP, load, store, etc were told to use their explicit type instead you might be able to find places where we’re still relying on pointer types before adding and converting to the opaque type.

From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of David Blaikie
Sent: Wednesday, February 18, 2015 2:01 AM
To: Chris Lattner
Cc: LLVM Developers Mailing List
Subject: Re: [LLVMdev] Moving towards a singular pointer type

On Tue, Feb 17, 2015 at 10:47 PM, David Blaikie <dblaikie at gmail.com <mailto:dblaikie at gmail.com> > wrote:

On Tue, Feb 17, 2015 at 10:27 PM, Chris Lattner <clattner at apple.com <mailto:clattner at apple.com> > wrote:

On Feb 17, 2015, at 1:58 PM, Chandler Carruth <chandlerc at gmail.com <mailto:chandlerc at gmail.com> > wrote:
> I'm not actually worried about this change though Chris, at least w.r.t. optimizer changes being necessary. There are a few reasons:
>
> 1) The old ScalarRepl pass cared a *lot* about pointer type, but the new SROA doesn't care at all, so the biggest offender is essentially handled.

Why do you think that SRoA is the biggest “offender”?  This will pretty fundamentally changing the shape of the IR (in a good way) by presumably eliminating a ton of bitcasts etc.  This has the potential to provoke instcombine regressions, tickle things like globalopt and load/store elimination, etc.  I don’t think that any of these will be particularly difficult to fix, but I imagine that there will be a long tail of minor things.

> 2) We've recently changed our pointer canonicalization rules several times and in different ways. Each of those changes helped shake out bugs where the optimizer was relying on the pointer type for something. The number of things found has dropped dramatically with each change, so I don't think there is a huge pile of problems left hiding somewhere.

This is more reassuring for me.

> 3) Almost all of the problems we found with the changes to canonicalization were actually cases where *casts* impeded optimizations, not the different pointer type. This change will be a strict reduction in the need for casts, and thus I expect it to actually be safer than the other changes. All evidence is that most of the remaining reliance on these kinds of things are actually relying on an absence of casts. With this change, the casts will all go away.

Yes, I like this change for a number of reasons: reduction of casts, simplified type resolution stuff in libIR, etc.

> So, I'm not as worried about having a very drawn out period of fixing the optimizer. I think we'll probably uncover a few minor things that we have to fix immediately, and then when we make the change some small number of benchmarks will regress (likely on some small number of platforms). We'll have to track those down, no doubt, but I'm not worried about it preventing progress for a long time.

So long as the regressions are tracked down and fixed before the mega-patch is landed, I’m ok with making this change.  I just think that finding any ways to make it more incremental and stage it will be very well rewarded.  It will be impossible to review the resultant patch otherwise.

It should come out somewhat incremental, I think. Here's how it's shaping up/I see it going:

1) add explicit types where required in IR instructions - gep, load, byval, anything else I can find... 
  a) Provide the mechanism to specify it (in textual IR, bitcode, and LLVM IR APIs)
  b) Update callers to Clang, LLVM, and Polly, to pass that information (initially just asserting that it's the same information as was provided by the typed-pointer operands)
  c) /rely/ on that information in LLVM - stop using the pointee types

After that it might be monolithic - though I'll be trying to do it incrementally for my own sanity. 

<I was thinking I might remove ptr-to-ptr bitcasts here, before introducing the ptr type - since at this point they'll be pointless (har har) already, since the pointer-using instructions will no longer need the type from them anyway>

2) introduce opaque pointer type (initially unused)
3) Repeat:
  a) choose an instruction or other source of pointer type
  b) update type to opaque pointer
  c) find & fix frontend bugs where it was relying on pointer type in the IR
  d) commit LLVM change

It could be monolithic there (do all the ptrs in one go), though I don't think it needs to be.

On further reflection this step will probably be monolithic - it'd be really hard to script updating only some pointer types when the type is written at the use, not the def, so I'd lose the context from "this instruction now produces ptr". (why do we put the type on the use instead of the def? *shrug*)

Ah well.

4) Remove non-opaque pointer types... - maybe. Might need to leave them in to make the back-compat bitcode reading easy, but I'm not sure.

-Chris

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150218/3f201e0b/attachment.html>