[LLVMdev] Brain dump on type merging

Thu Dec 4 01:18:05 PST 2014

----- Original Message -----
> From: "Duncan P. N. Exon Smith" <dexonsmith at apple.com>
> To: "Rafael Espíndola" <rafael.espindola at gmail.com>
> Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>, "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Thursday, December 4, 2014 12:44:39 AM
> Subject: Re: Brain dump on type merging
> 
> +llvmdev, so this gets more eyes.
> 
> These points make sense to me too.  In LTO, at least, sharing
> constants
> and metadata in the context seems mostly to work against us.
> 
> I'm less aware of the tradeoffs re: the opaque pointer type, but your
> brain dump is compelling.  I'd be interested in whether anyone thinks
> the loss of type safety there is important.
> 
> > On 2014 Dec 3, at 21:00, Chandler Carruth <chandlerc at google.com>
> > wrote:
> > 
> > Just want to go on record that:
> > 
> > 1) I completely agree that constants, types, and metadata should
> > all be module owned. It makes so much sense. This becomes
> > tremendously more appealing when (not if!) we make datalayout
> > required and frontend-provided. Because then all of these things
> > have deterministic always-available access to it, etc. Goodness.
> > Pure goodness.
> > 
> > 2) I'm 100% behind moving to an opaque pointer type. Combined with
> > mandatory datalayout, lots of nice things become possible. If we
> > can trivially move back to structural type identity, I'm all for
> > that. But I would do it in steps -- first get the opaque pointer,
> > then try to improve type merging, etc.

+1 -- We don't need types on pointers -- they don't convey any information to the optimizer. FWIW, it would not surprise me if the elimination of all of the unnecessary pointer bitcasts produces measurable compile-time savings.
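
To make the bitcast point concrete, here is a toy Python model (not LLVM's API; `lower_call`, the type strings, and the `ptr` spelling are all made up for this sketch). With typed pointers, passing a %Derived* where a %Base* is expected costs an extra bitcast instruction. With a single pointer type, the cast is never created.

```python
def lower_call(arg_type, param_type, opaque_pointers):
    """Return the toy instruction list needed to pass one pointer argument."""
    def is_ptr(t):
        return t.endswith("*")

    insts = []
    if opaque_pointers and is_ptr(arg_type) and is_ptr(param_type):
        # Every pointer is the same type, so the argument trivially matches.
        arg_type = param_type = "ptr"
    if arg_type != param_type:
        # Typed pointers: a cast exists only to satisfy the type checker.
        insts.append(f"bitcast {arg_type} to {param_type}")
    insts.append(f"call void @f({param_type})")
    return insts


typed = lower_call("%Derived*", "%Base*", opaque_pointers=False)
opaque = lower_call("%Derived*", "%Base*", opaque_pointers=True)
```

In this model `typed` holds two instructions and `opaque` holds one. Whether that translates into measurable compile-time savings in practice is exactly the open question.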

 -Hal

> > 
> > On Wed, Dec 3, 2014 at 8:22 PM, Rafael Espíndola
> > <rafael.espindola at gmail.com> wrote:
> >> I have spent most of last week working on improving the type
> >> merging
> >> during LTO as a side effect of trying to fix what looked like a
> >> simple
> >> PR (pr21374).
> >> 
> >> I think I have committed all the work I have for the area for the
> >> foreseeable future, but working on it did get me thinking about
> >> how we
> >> handle types, so I want to write down some very long term ideas
> >> while
> >> they are fresh.
> >> 
> >> The first thing that is odd is that types are owned by the
> >> context.
> >> This requires hacks when we want to think of types "of a module"
> >> as we
> >> do during linking. We have agreed to move metadata ownership from
> >> the
> >> context to the modules (something I should get to in the near
> >> future).
> >> I wonder if it would make sense to move constants and types too.
> >> Other
> >> than avoiding the semantic mismatch, this would have other
> >> advantages:
> >> 
> >> * No leaks in LLVMContext (design ones at least). Once a module is
> >> deleted, all data that one normally thinks of as being part of the
> >> module
> >> is gone. Right now an LLVMContext that is used for a sequence of
> >> modules will slowly leak memory as metadata, constants and types
> >> accumulate.
> >> 
> >> * Given a constant or type we would be able to get to the Module
> >> and
> >> from there to the DataLayout, removing the need for the immutable
> >> pass
> >> we have for it right now.
> >> 
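
The leak described in the first bullet can be sketched with a toy model in Python. The `Context` and `Module` classes below are hypothetical stand-ins and not the real LLVM classes. The only assumption is that the context keeps a uniquing table that modules add to and nothing ever removes from.

```python
class Context:
    """Toy stand-in for a context that owns every uniqued type."""

    def __init__(self):
        self.types = {}  # structural key -> type object; never shrinks

    def get_struct(self, name, fields):
        key = (name, tuple(fields))
        return self.types.setdefault(key, key)


class Module:
    """Toy module that creates its types through the shared context."""

    def __init__(self, ctx, name):
        self.ctx = ctx
        self.name = name

    def add_struct(self, fields):
        return self.ctx.get_struct(f"{self.name}.S", fields)


ctx = Context()
for i in range(3):
    m = Module(ctx, f"m{i}")
    m.add_struct(["i32"])
    del m  # the module is gone, but its type is still held by ctx.types

leaked = len(ctx.types)
```

After the loop no module is alive, yet `ctx.types` still holds one entry per module. If the table lived in the module instead, it would be freed together with the module.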
> >> Other than the ownership, some other thoughts came to mind:
> >> 
> >> First, the fact that the linker merges types by name is incredibly
> >> ugly. We assign no semantic meaning to names, but your frontend
> >> had better
> >> name types as we expect or lib/Linker will add a lot of casts to
> >> your
> >> modules.
> >> 
> >> Second, the hard part (and I assume the slow part) of strict
> >> structural equality (like what we had in 2.9) is the case where
> >> one
> >> type is found to be equivalent to another. This is what llvm 2.9
> >> had
> >> forwarding pointers in the types for. It can cause a cascading
> >> effect
> >> where multiple types are merged. This is now even harder since we
> >> don't have the forwarding pointer anymore.
> >> 
> >> Comparing an SCC of types to existing SCCs during linking, on the
> >> other
> >> hand, is relatively simple. I implemented a brute force approach in
> >> the
> >> hope of replacing the name based merging, but it was not
> >> sufficient
> >> because of opaque types.
> >> 
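
For readers who want to see what comparing cyclic types structurally can look like, here is a minimal Python sketch. It is not the brute force approach referred to above. The encoding is an assumption made for this example: a type is a primitive name, a pointer written as ("ptr", T), or a struct name looked up in a per-module dictionary. A pair of structs that is already being compared is assumed equal, which is what lets the recursion terminate on cycles.

```python
def structurally_equal(a, b, defs_a, defs_b, assumed=None):
    """Compare two possibly cyclic type graphs from two modules."""
    assumed = set() if assumed is None else assumed
    if isinstance(a, tuple) or isinstance(b, tuple):
        # Pointers: both sides must be pointers with equal pointees.
        return (isinstance(a, tuple) and isinstance(b, tuple)
                and structurally_equal(a[1], b[1], defs_a, defs_b, assumed))
    if a not in defs_a or b not in defs_b:
        # At least one side is a primitive: both must be the same primitive.
        return a == b and a not in defs_a and b not in defs_b
    if (a, b) in assumed:
        return True  # already on the comparison stack: a cycle closed
    assumed.add((a, b))
    fa, fb = defs_a[a], defs_b[b]
    return len(fa) == len(fb) and all(
        structurally_equal(x, y, defs_a, defs_b, assumed)
        for x, y in zip(fa, fb))


mod_a = {"A.node": ["i32", ("ptr", "A.node")]}
mod_b = {"B.list": ["i32", ("ptr", "B.list")],
         "B.other": ["i64", ("ptr", "B.other")]}

same = structurally_equal("A.node", "B.list", mod_a, mod_b)
different = structurally_equal("A.node", "B.other", mod_a, mod_b)
```

Names play no role in the result: "A.node" and "B.list" compare equal, while "B.other" differs in its first field. Opaque types are deliberately left out, since they are the case this kind of comparison cannot decide.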
> >> The problem with opaque types is that we can link (resolve) them
> >> with
> >> any other type, but which one we link them with can have dramatic
> >> consequences to how many casts we have to introduce.
> >> 
> >> One approach would be to not resolve them during linking, but wait
> >> and
> >> see which casts get created and then resolve them in a way that would
> >> remove the casts. This would be fairly expensive as we would have
> >> to
> >> walk the entire IR to replace types (no RAUW for types).
> >> 
> >> This is why the name based type merging we have now is so
> >> important:
> >> it provides a heuristic as to what type an opaque one should be
> >> resolved to. Type merging of non-opaque types (even with cycles)
> >> can
> >> be done without the heuristic.
> >> 
> >> This then brought to mind an idea that I have seen mentioned in
> >> informal discussions but never on the list: Maybe we should have a
> >> single pointer type instead of i8*, i32*, %foobar**, etc.
> >> 
> >> With a single pointer type we would be able to also drop opaque
> >> types
> >> (since we always use a pointer to an opaque) and cycles (since
> >> they
> >> have to go through a pointer). This will bring back structural
> >> type
> >> equality, but without the std::multimaps and PATypeHolders.
> >> 
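
A small Python sketch of why a single pointer type makes structural identity cheap again. `TypeTable` is a hypothetical uniquing table and not an existing LLVM class. Because a pointer no longer names its pointee, a linked-list node is an ordinary acyclic shape and can be uniqued with a plain dictionary lookup.

```python
class TypeTable:
    """Toy uniquing table: one object per distinct structure."""

    def __init__(self):
        self._unique = {}

    def get(self, *shape):
        # Return the existing object for this shape, or register a new one.
        return self._unique.setdefault(shape, shape)

    def struct(self, *fields):
        return self.get("struct", *fields)


table = TypeTable()
PTR = table.get("ptr")  # the single pointer type; it carries no pointee

# Two "linked list node" types built independently, e.g. by two modules.
node_a = table.struct("i32", PTR)
node_b = table.struct("i32", PTR)
other = table.struct("i64", PTR)
```

Here `node_a` and `node_b` are the same object, and `other` is distinct, without any forwarding pointers or refinement step.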
> >> The type would be transferred to what actually uses it: load,
> >> store,
> >> gep. This would also help with cases where the FE has to
> >> introduce
> >> casts just to make LLVM happy. One case that comes to mind is when
> >> a
> >> class destructor is equivalent to a base class one. Currently
> >> clang
> >> has to introduce casts because of the different types of the this
> >> pointer.
> >> 
> >> Cheers,
> >> Rafael
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory