[LLVMdev] Parametric polymorphism

Wed Feb 18 13:27:21 PST 2009

> I was thinking of the "T extends Comparable" part, which does involve
> subtype polymorphism.  Apologies if I'm getting terms mixed up.

It was a bad example -- not close enough to actual LLVM.  :-)

> What do the parametrized types give you that you don't get from using
> opaque instead of T?

Possibly nothing.  I don't really understand the limitations of opaque.
Is it possible to declare a structure type { int, opaque, int }, and then
use getelementptr to retrieve the third element?  I'm guessing not,
because there's no way for the code generator to calculate the
correct offset.

If T is a type variable, then the type { int, T, int } should be valid.
Eventually of course, the native code generator will need to know
the size of T, but that shouldn't worry other optimization passes
that are only dealing with the IR.

> Type checking has already passed by the time it
> reaches this level, and you would simply add the appropriate bitcasts
> when generating the functions to be added to the dictionaries.

Assuming that all values are 32-bit.  Oops -- can't use doubles, long
doubles, or compile to 64-bit architectures.  Doh!   ;-)

> I wonder if the "instantiation" could be done instead as a normal pass

Yes, and it should.

> inter-procedural points-to analysis.  Getting rid of the indirection
> in the generic dictionary seems like the same problem as
> devirtualizing method calls, so a unified solution would be nice.

Devirtualization is actually a lot trickier, since it relies on whole
program analysis and type information from the high-level language;
we need to know that a class C has no subclasses in order to
devirtualize its methods.  Getting rid of the dictionary indirection is
a simple matter of constant propagation and inlining; no type wizardry
required.  (This is one reason why I'm not a big fan of OOP type
systems.)

> But specialization (in the C++ template sense) is also a great example
> of why it's needed in the host language, as is overloading.

There's no reason why specialization -has- to be done by llvm.
Idiotic languages like C++ can do it themselves, badly, if they want. ;-)
I want llvm to do it for me because it can do a better job.  ;-)

>>  void getSecond(int sizeT, void* pair, void* result) {
>>    return memcpy(result, ((char*) pair) + sizeT, sizeT);
>>  }
>
> Of course, that only works for POD types, so you still need a
> different instantiation for std::string, std::vector, ...

Every llvm type is a POD type.  It's either a scalar, a vector,
an array, a struct, or a pointer.  Every one of those types has a
fixed size.  The C++ compiler is supposed to translate complicated
thingies like std::string into a POD type that llvm can understand.

> There's no possibility of implementing C++'s complicated lookup and
> resolution rules down at the LLVM level,

Why on earth would we want to do anything like C++ lookup?  I
tried writing a C++ parser once, and I think the Obama administration
can easily use it as a geneva-convention-friendly alternative to
waterboarding on suspected terrorists.  :-)

> Once you restrict yourself to generics, you're only ever passing
> pointers around, which, as you said, is the easy case (relatively),
> since you don't need the type information at all once past the
> front-end.

Yes, and Java generics are dog-slow because they can only use
pointers.  Try implementing a generic complex number class in Java,
and watch the two-order-of-magnitude drop in performance on scientific
code.

  -DeLesley