[LLVMdev] Extending LLVM for high-level types
Nick Lewycky
nicholas at mxc.ca
Thu Jan 13 00:46:14 PST 2011
Alexandre Cossette wrote:
> Hi all,
>
> I'm designing a programming language named C³ (or C3). I'm already using LLVM as a back-end for my prototype compiler and it's wonderful to use. Thanks for such a great system!
>
> I now have more ambitious goals and I would like to use the LLVM IR as my internal C³ IR.
Absolutely not.
In short, LLVM is its own language. You don't need to extend LLVM IR to
support your programming language any more than you need to extend x86
processors to support it.
There's the burden of having that support. For starters LLVM's types are
purely based on the storage that they back. Most languages use type to
provide static program safety, or possibly semantics (i.e., + means
string concat on a string but addition on integers). LLVM doesn't do
that. Further our types are uniqued such that any two types with the
same in-memory representation have the same LLVM type; we don't discard
names, but we don't preserve a distinction because there isn't any
distinction to preserve. That in turn allows us to do fast structural
comparisons using a pointer comparison.
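To make the uniquing point concrete, here is a minimal sketch of the idea in Python. It is not LLVM's code: the Type and TypeContext classes are hypothetical, and Python's "is" stands in for the pointer comparison. The context keeps one canonical object per distinct structure, so two front-end types with the same layout collapse into the same object.

```python
# Hypothetical miniature type system for illustration; not LLVM's classes.
class Type:
    """A type node. Instances should only be created through TypeContext."""
    __slots__ = ("kind", "payload")

    def __init__(self, kind, payload):
        self.kind = kind
        self.payload = payload


class TypeContext:
    """Hands out exactly one canonical Type object per distinct structure."""

    def __init__(self):
        self._pool = {}

    def _intern(self, kind, payload):
        # Element types are already canonical, so keying on their identities
        # is equivalent to keying on their full structure.
        key = (kind, tuple(id(p) if isinstance(p, Type) else p for p in payload))
        if key not in self._pool:
            self._pool[key] = Type(kind, payload)
        return self._pool[key]

    def integer(self, bits):
        return self._intern("int", (bits,))

    def struct(self, *elements):
        return self._intern("struct", tuple(elements))


ctx = TypeContext()
i32 = ctx.integer(32)
point = ctx.struct(i32, i32)  # what a front end might call "Point"
size = ctx.struct(i32, i32)   # what a front end might call "Size"

# Same layout, same object: structural equality is an identity check.
print(point is size)  # True
```

Note that the distinction between "Point" and "Size" is gone after uniquing, which is exactly why a type uniqued by name does not fit this model.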
Then we'd have to extend core passes like mem2reg, gvn, and all of their
dependencies. These are performance critical pieces of kit, and we
categorically reject any attempt to push in pieces of infrastructure
that won't be needed by all users. Put another way, if I want to use
LLVM for C code on a cell phone, I shouldn't need to pay the
memory/execution-time price for your LLVM changes to support C³.
Finally, you haven't detailed what benefit you expect out of your
proposal. Why can't you just lower to the existing IR and get the same
optimizations out of it? What optimizations aren't possible and why? Can
we tackle those issues instead? We've gotten very far by designing
extensions to LLVM which are language-agnostic and can be used by any
client. For example, if your language has alias analysis optimizations
that rely on high-level type information, LLVM has a TBAA (type-based
alias analysis) design that you could employ to give LLVM the
additional information it needs to optimize with.
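As a rough illustration of how the TBAA idea works, here is a simplified model in Python. It assumes the tree-shaped scheme in which a front end assigns each memory access a node in a type tree, and two accesses may alias only if one node is an ancestor of (or the same as) the other. The TBAANode class, the may_alias function, and the example tree are all hypothetical names made up for this sketch; in LLVM itself the information is attached to loads and stores as metadata.

```python
# Simplified, hypothetical model of a TBAA type tree; not LLVM's API.
class TBAANode:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # None marks the root of the tree


def is_ancestor_or_self(a, b):
    """Return True if node a is b itself or lies on b's path to the root."""
    node = b
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def may_alias(a, b):
    """Two accesses may alias only if one type is an ancestor of the other."""
    return is_ancestor_or_self(a, b) or is_ancestor_or_self(b, a)


# An example tree of the kind a C-like front end might describe:
# char can alias anything, while int and float are unrelated siblings.
root = TBAANode("root")
char_ty = TBAANode("char", root)
int_ty = TBAANode("int", char_ty)
float_ty = TBAANode("float", char_ty)

print(may_alias(int_ty, char_ty))   # True
print(may_alias(int_ty, float_ty))  # False
```

The point of the design is that the tree is described by the front end, so a language with its own high-level types can encode its aliasing rules without LLVM needing to know anything about those types.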
Sorry to sound so negative, but I'm confident that LLVM can provide you
with the same generated code quality in the same execution time, only
through a different design than you propose. If you can show us missed
optimizations (or bad compile time problems) when using the naive
approach of lowering your high-level types to llvm's low-level types,
please let us know so we can solve them case-by-case!
Nick
> C³ is designed to support what I call "value-oriented programming" and
> it fits naturally with the design of LLVM. The idea is to apply
> SSA-based optimizations on user-defined types.
>
> I would like to know if you think this plan makes sense:
> - Add a new derived type that is uniqued by name for C³ types
> - Add new intrinsic functions for C³ expressions with special semantics
> - Emit this "extended LLVM" from my abstract syntax tree
> - Run the mem2reg pass as is for SSA construction
> - Run optimization passes that can run as is with the new type (like GVN?)
> - Run a new pass that lowers the extended LLVM to normal LLVM
> - Run (or rerun) normal LLVM optimization passes
> - Emit native code using normal LLVM
> - Profit!
>
> Alex