[LLVMdev] Extending LLVM for high-level types

Thu Jan 13 16:54:32 PST 2011

On 01/13/2011 03:46 AM, Nick Lewycky wrote:
> Absolutely not.
>
> In short, LLVM is its own language. You don't need to extend LLVM IR to
> support your programming language any more than you need to extend x86
> processors to support it.
>
> There's the burden of having that support. For starters LLVM's types are
> purely based on the storage that they back. Most languages use type to
> provide static program safety, or possibly semantics (i.e., + means
> string concat on a string but addition on integers). LLVM doesn't do
> that. Further our types are uniqued such that any two types with the
> same in-memory representation have the same LLVM type; we don't discard
> names, but we don't preserve a distinction because there isn't any
> distinction to preserve. That in turn allows us to do fast structural
> comparisons using a pointer comparison.
>
> Then we'd have to extend core passes like mem2reg, gvn, and all of their
> dependencies. These are performance critical pieces of kit, and we
> categorically reject any attempt to push in pieces of infrastructure
> that won't be needed by all users. Put another way, if I want to use
> LLVM for C code on a cell phone, I shouldn't need to pay the
> memory/execution-time price for your LLVM changes to support C³.
>
> Finally, you haven't detailed what benefit you expect out of your
> proposal. Why can't you just lower to the existing IR and get the same
> optimizations out of it? What optimizations aren't possible and why? Can
> we tackle those issues instead? We've gotten very far by designing
> extensions to LLVM which are language-agnostic and can be used by any
> client. For example, if your language has alias analysis optimizations
> that rely on high-level type information, LLVM has a TBAA (type based
> aliasing analysis) design that you could employ to give LLVM the
> additional information it needs to optimize with.
>
> Sorry to sound so negative, but I'm confident that LLVM can provide you
> with the same generated code quality in the same execution time, only
> through a different design than you propose. If you can show us missed
> optimizations (or bad compile time problems) when using the naive
> approach of lowering your high-level types to llvm's low-level types,
> please let us know so we can solve them case-by-case!
>
> Nick

I think that what Alexandre wants to do is to leverage the power of the
LLVM SSA transformation/optimization framework for types that are not
natively defined by LLVM.  I believe this is already possible in LLVM
(with the addition of some select user-defined passes and careful use
of types), but it can be awkward because of the structural typing
inherent in LLVM.  For example, I map one of the custom types in my
language to i64, but this only makes sense as long as I can uniquely
identify this type as i64; that is, as long as I haven't overloaded i64
to mean anything else.  Other types could be introduced as other
bit-width integers (i65), structure types, etc.  So it's possible, if
not clean.
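
To make the workaround concrete, here is a minimal C++ sketch.  It does
not use the real LLVM API: TypeContext, IntType and HighLevelTypes are
hypothetical stand-ins, and the sketch assumes only what was said
above, namely that integer types are uniqued by bit width (so a pointer
comparison is a structural comparison) and that a front end can claim a
bit width for at most one of its own types.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Stand-in for an LLVM-like integer type: its identity is its bit width.
struct IntType {
    explicit IntType(unsigned b) : bits(b) {}
    unsigned bits;
};

// Hypothetical context that uniques types structurally: asking twice for
// the same bit width returns the same object, so pointer equality is
// structural equality.
class TypeContext {
public:
    const IntType *getInt(unsigned bits) {
        assert(bits > 0 && "integer types need at least one bit");
        std::unique_ptr<IntType> &slot = ints_[bits];
        if (!slot)
            slot.reset(new IntType(bits));
        return slot.get();
    }

private:
    std::map<unsigned, std::unique_ptr<IntType>> ints_;
};

// Hypothetical front-end side table: each low-level type may stand for at
// most one high-level type, so the mapping can be inverted later.
class HighLevelTypes {
public:
    explicit HighLevelTypes(TypeContext &ctx) : ctx_(ctx) {}

    // Claims the integer type of the given width for `name`. Returns false
    // if that representation already belongs to a different high-level type.
    bool bind(const std::string &name, unsigned bits) {
        const IntType *t = ctx_.getInt(bits);
        auto it = owner_.find(t);
        if (it != owner_.end() && it->second != name)
            return false;
        owner_[t] = name;
        return true;
    }

    // Recovers the high-level name from a low-level type, or "" if none.
    std::string nameOf(const IntType *t) const {
        auto it = owner_.find(t);
        return it == owner_.end() ? std::string() : it->second;
    }

private:
    TypeContext &ctx_;
    std::map<const IntType *, std::string> owner_;
};
```

With this table, binding a made-up type "Handle" to 64 bits succeeds, a
second made-up type "Timestamp" is refused 64 bits and has to take
another width such as 65, and nameOf() then recovers the right
high-level type from either integer type.  The refusal is the whole
point: once two high-level types share a representation, nothing in the
uniqued type tells them apart.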

Actually, looking over the list of optimizations on LLVM IR, I'm having
trouble finding more than a handful that explicitly rely on the storage
type of all data.  So it seems like a valid use case to use LLVM for
optimization with user-specific types within SSA form, before lowering
the code (or translating back to source).

Andrew