[LLVMdev] "Mapping High-Level Constructs to LLVM IR"

David Tweed david.tweed at gmail.com
Mon Nov 25 01:04:04 PST 2013


Hi, documentation is always good and this is a great idea. It'll be
particularly useful as a place where additional examples of constructs from
non-C-family languages could be added (since most compiler tutorials
inevitably focus on languages that are a lot like C/C++/Obj-C).

I'd imagine you've already thought of this, but this might be a case where
pseudo-LLVM-IR has the most pedagogical value. For example, writing code
that accesses complex structured memory through multiple levels of GEPs
followed by loads is quite tricky, but for many of the constructs it's only
a detail, so you could probably express those parts with a pseudo-instruction
(going one step further, into fully explicit LLVM IR, only where necessary).
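
To make the GEP point concrete, here is a minimal sketch, assuming a made-up
nested layout (the Point, Shape and Scene types and both function names are
hypothetical, not taken from the document under discussion). It is C++ rather
than IR: the first function is what the source language shows, and the second
splits the same access into the address-computation and load steps that
explicit IR would roughly have to spell out.

```cpp
#include <cstdint>

// Hypothetical nested layout, invented purely for illustration.
struct Point { std::int32_t x; std::int32_t y; };
struct Shape { std::int32_t id; Point corners[4]; };
struct Scene { std::int32_t count; Shape *shapes; };

// Source-level view: a single expression hides every step.
std::int32_t corner_y(const Scene *scene, int shape, int corner) {
    return scene->shapes[shape].corners[corner].y;
}

// The same access, split into steps that roughly mirror explicit IR.
std::int32_t corner_y_stepwise(const Scene *scene, int shape, int corner) {
    // Address of field 1 of Scene (roughly: a GEP with indices 0, 1).
    Shape *const *shapes_field = &scene->shapes;
    // Load the pointer stored in that field.
    Shape *shapes = *shapes_field;
    // Address computation only, no memory access yet (roughly: one GEP
    // indexing by shape, field 1, corner, field 1).
    const std::int32_t *y_addr = &shapes[shape].corners[corner].y;
    // Final load of the 32-bit value.
    return *y_addr;
}
```

A pseudo-instruction in a tutorial would stand in for everything in the second
function, which is why the fully explicit form is probably best kept for the
places where the addressing itself is the topic.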

Cheers,
Dave


On Sat, Nov 23, 2013 at 6:18 AM, Mikael Lyngvig <mikael at lyngvig.org> wrote:

> Thanks, you have a lot of valid points there.  I have myself long ago
> abandoned the path of using C as a backend language due to the very factors
> you mention.
>
> However, as I said, the document was put together in 30 minutes.  Not
> exactly ready for prime time :-)
>
> I do agree that all of the things you mention should be described,
> including lambdas, closures, and generators, but I must admit up front that
> I don't know how to implement half of them.  But I suppose I could do a lot
> of research and perhaps occasionally ask you guys for specifics.
>
> We are not going to find much common ground on the issue of "calling
> propagated return values for exception handling", I think :-)  See
> https://www.lyngvig.org/Teknik/A-Proposal-for-Exception-Handling-in-C for
> the details.
>
> I started out with C++ as the example language because a lot of people
> know that language - and most certainly the majority of the LLVM user base.
> Obviously, you'd have to add source code from languages other than C++
> when C++ does not provide the features needed to illustrate the process.
>
> I now agree that the lowering into C is not such a good idea after all.
> So I'll go straight from source language to LLVM IR, which is not that
> difficult, and won't be very different for the reader.  In fact,
> I think it will be much better than my original approach.
>
> Thanks again for your valid objections.
>
>
> -- Mikael
>
>
>
>
> 2013/11/23 Joshua Cranmer 🐧 <Pidgeot18 at gmail.com>
>
>> On 11/22/2013 9:25 PM, Mikael Lyngvig wrote:
>>
>>> Hi guys,
>>>
>>> I have begun writing a new document, named "Mapping High-Level
>>> Constructs to LLVM IR", in which I hope to eventually explain how to map
>>> pretty much every contemporary high-level imperative and/or OOP language
>>> construct to LLVM IR.
>>>
>>> I write it for two reasons:
>>>
>>> 1. I need to know this stuff myself to be able to continue on my own
>>> language project.
>>> 2. I feel that this needs to be documented once and for all, to save
>>> tons of time for everybody out there, especially for the language inventors
>>> who just want to use LLVM as a backend.
>>>
>>> So my plan is to write this document and continue to revise and enhance
>>> it as I understand more and helpful people on the list and elsewhere
>>> explain to me how these things are done.
>>>
>>> Basically, I just want to know if there is any interest in such a
>>> document or if I should put it on my own website.  If you know of any books
>>> or articles that already do this, then please let me know about them.
>>>
>>> I've attached the result of 30 minutes' work, just so that you can see
>>> what I mean.  Please don't review the document as it is still in its very
>>> early infancy.
>>>
>>
>> There is a strong bias towards C++ in the document, which isn't a
>> particularly strong slice of higher-level constructs. For example, C++'s
>> RTTI constructs serve three distinct purposes: exception handling, dynamic
>> casts, and reflection (although C++'s reflection capabilities are extremely
>> weak). You'll need to talk about inheritance in the three cases: single,
>> multiple, and virtual (to use C++'s terminology) (note that Java's
>> interfaces can be implemented as virtual inheritance). Boxing is another
>> important topic. Lambdas, closures, and generators (yield keyword) are
>> becoming increasingly common in modern programming languages, and should
>> not be ignored.
>>
>> Finally, calling propagated return values "exception handling" does an
>> extreme disservice to your readers. LLVM IR explicitly models exception
>> handling, and attempting to describe it lowered as return values is not how
>> anyone should implement it. If you badly want to describe it in C terms,
>> you could at least use C's setjmp/longjmp to describe it; the truth is,
>> this is a feature which doesn't exist cleanly in C.
>>
>> Trying to describe mapping higher-level languages to C and then C to IR
>> is a poor idea. C is in some ways an extremely limited language (e.g., no
>> native exception handling constructs). If you want this to be a guide to how to
>> lower languages to LLVM IR, you need to also explain how to take advantage
>> of features in the IR to optimize code better (e.g., TBAA). Cfront-like C++
>> compilers are extremely rare-to-nonexistent (in part because it is
>> difficult to map some features, most notably exception handling, cleanly
>> and efficiently into C); if your guide is describing such an approach, it
>> reads like an implicit endorsement. It is possible to describe some aspects
>> of the IR in C, but if the goal is to lower to IR, then the description
>> should be lowering to IR, not lowering to C.
>>
>> --
>> Joshua Cranmer
>> Thunderbird and DXR developer
>> Source code archæologist
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>
>


-- 
cheers, dave tweed__________________________
high-performance computing and machine vision expert: david.tweed at gmail.com
"while having code so boring anyone can maintain it, use Python." --
attempted insult seen on slashdot