[LLVMdev] Lost in the documentation

Hendrik Boom hendrik at topoi.pooq.com
Tue Apr 29 10:59:44 PDT 2008

On Tue, 29 Apr 2008 09:46:35 -0400, Gordon Henriksen wrote:

> On 2008-04-29, at 08:41, Hendrik Boom wrote:
>> On Mon, 28 Apr 2008 17:54:31 -0400, Gordon Henriksen wrote:
>>> On Apr 28, 2008, at 17:32, Hendrik Boom wrote:
>>>> In http://llvm.org/docs/FAQ.html, when talking about writing a
>>>> compiler that uses LLVM (at least I think that's what the FAQ
>>>> question is asking), the FAQ recommends
>>>>> #  Call into the LLVM libraries code using your language's FFI
>>>>>    (foreign function interface).
>>>>>   * for: best tracks changes to the LLVM IR, .ll syntax, and .bc
>>>>>          format
>>>>>   * for: enables running LLVM optimization passes without an
>>>>>          emit/parse overhead
>>>>>   * for: adapts well to a JIT context
>>>>>   * against: lots of ugly glue code to write
>>>> Now, which particular libraries would that be
>>> With the exception of the 'util' and 'tools' directories, the entire
>>> LLVM source tree consists of libraries.
>> Indeed, quite a lot of them.  Most of them appear to be internal. I'm
>> trying to identify the ones that are intended for use by LLVM users.
> include/llvm is all public (modulo some implementation details as
> required by the nature of C++). Private includes are in lib. But realize
> that not all users are front-end compilers. A back-end code generator is
> also a user of the framework; as is an IR optimization or analysis. The
> C++ interfaces support all of these clients equally.
> VMCore and BitWriter are the libraries absolutely necessary for any
> static compiler that outputs bitcode. You'll likely want Analysis for
> the verifier; and Target for memory layout information. That's the
> basics.
>> I have to say I missed the crucial paragraph:
>> : If you go with the first option, the C bindings in include/llvm-c
>> : should help a lot, since most languages have strong support for
>> : interfacing with C. The most common hurdle with calling C from
>> : managed code is interfacing with the garbage collector. The C
>> : interface was designed to require very little memory management,
>> : and so is straightforward in this regard.
>> Evidently I have to go look in include/llvm-c, since I strongly
>> suspect you didn't go to the trouble of writing a C wrapper for
>> anything that wasn't needed by an LLVM user.  Anything internal you'd
>> have left in C++.
>> So the API for a C++ *user* could be described as "those parts of the
>> internals API that happen to be used in implementing llvm-c".
> That's a rather poor definition. Only bindings for such features as have
> been required are authored. Still, if this helps you make sense of the
> framework, then that's fantastic; but remember that it is an imperfect
> rule.
> Using the C bindings, it's still very important to understand the
> underlying C++ object model; otherwise, the type rules for the bindings
> will appear to be rather capricious.
>> Putting this together with the tutorial, http://llvm.org/docs/tutorial/,
>> which uses CAML instead of C, I think I may be able to get a clue.
> If you're not using ocaml, the C++ tutorial (the first one on that page)
> is probably more pertinent, even if you do intend to use the C bindings.
> Searching the implementation of the bindings (lib/VMCore/Core.cpp,
> etc.) is helpful for "going backwards" from C++ to C once you begin to
> understand the object model.
>>>> where are their API(s) documented?
>>> http://llvm.org/docs/
>>> http://llvm.org/doxygen/
>>> http://llvm.org/docs/tutorial/
>>> etc etc etc.
>>> — Gordon
>> The doxygen page describes the complete internal structure of LLVM. It
>> explicitly says,
>> ; This documentation describes the internal software that makes up
>> ; LLVM, not the external use of LLVM. There are no instructions here
>> ; on how to use LLVM, only the APIs that make up the software. For
>> ; usage instructions, please see the programmer's guide or reference
>> ; manual.
>> I haven't yet found a "programmer's guide".
> http://llvm.org/docs/ProgrammersManual.html

Thanks.  Here's what I have in mind to do with LLVM.  I have a few
languages to compile; all of them require garbage collection.  I'll be
looking at the ocaml experience with some interest.  How far I get into
implementing them depends on the available time and the state of my
enthusiasm.  It has been known to go missing, and it often gets diverted
to so-called real life.

One of these languages, Algol 68, I was working on about 35 years ago.  
It was not finished mainly because at some point the machinery I was 
developing it on became unavailable.  It correctly ran over half of a 
demanding test suite when the project stopped.  It's now something I'd 
like to finish more for old times' sake than for any serious use.  35 years
ago, this compiler would run in about 900K of memory.  That was a dream
machine back then.  Using an overlay linker, it could be crammed into 
400K.  It was written in Algol W, and could use a new portable code 
generator.  It used garbage collection at compile time, but on today's 
machines I could probably get away with wholesale memory leakage.

To get it working, of course I need something that implements Algol W.  
I've tinkered with translating Algol W to C or something similar.  I 
originally intended to translate the Algol 68 compiler into Algol 68, to 
make it self-supporting, but I never got that far.  I have an Algol W 
parser, and at least one ancient attribute grammar that (too slowly) 
translates it to something else.  Since I'll only be using it to develop 
Algol 68, which runs in 900K, I can probably dispense with garbage 
collection and just use my 4 gigabytes of RAM instead.

I also have a self-implementing program-transformation tool. It consists 
of a recursive-descent parser generator, a tree-rewriting system, and an 
unparser.  In principle, it needs garbage collection.  In practice, well,
I've said it before.  Memories are large these days.
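
The "dispense with garbage collection" idea above can be sketched in C as an allocator that only ever hands memory out and never takes it back. This is an illustration under assumptions, not code from any of the tools mentioned: the names LeakyArena, leaky_arena_init and leaky_alloc are hypothetical, and a C11 compiler is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; nothing here comes from the compilers
 * or tools described in the message. */
typedef struct {
    unsigned char *base;  /* start of the one big block */
    size_t used;          /* bytes handed out so far */
    size_t capacity;      /* total bytes in the block */
} LeakyArena;

/* Reserve one large block up front. Returns 0 on success, -1 on failure. */
static int leaky_arena_init(LeakyArena *a, size_t capacity) {
    assert(a != NULL);
    a->base = malloc(capacity);
    a->used = 0;
    a->capacity = a->base ? capacity : 0;
    return a->base ? 0 : -1;
}

/* Hand out `size` bytes, aligned for any object type. There is
 * deliberately no matching free: allocations live until the process
 * exits, which is the "wholesale leakage" trade-off. Returns NULL when
 * the block is exhausted. */
static void *leaky_alloc(LeakyArena *a, size_t size) {
    assert(a != NULL);
    const size_t align = _Alignof(max_align_t);
    size_t start = (a->used + align - 1) / align * align;  /* round up */
    if (start > a->capacity || size > a->capacity - start)
        return NULL;
    a->used = start + size;
    return a->base + start;
}
```

A compiler built this way would call leaky_arena_init once at startup with a generous capacity and route every tree-node allocation through leaky_alloc. The cost is that peak memory equals total memory ever allocated, which is the bet being made when the working set was 900K and the machine has gigabytes.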

-- hendrik

> — Gordon
