[LLVMdev] Newbie questions

Sun Apr 23 11:46:05 PDT 2006

On Sun, 2006-04-23 at 09:43 -0500, Archie Cobbs wrote:
> Hi,
> 
> I'm just learning about LLVM (really interesting) and have some newbie
> questions. Feel free to ignore or disparage them if they're inappropriate :-)

No worries.

> 
> My area of interest is using LLVM in a Java JVM setting. These are
> just some random questions relating to that...
> 
> 1. What is the status of the LLVM+Java effort? 

Incomplete but significant progress has been made. Misha Brukman can
tell you more.

> Is it GCJ-specific?

No, it implements its own Java compiler and bytecode translator.

>     Is there a web page? 

Not that I'm aware of. But you can obtain the source code from the
llvm-java repository via CVS. Just replace "llvm" with "llvm-java" in
the usual CVS instructions.

> I found one link via google but it was broken.

Okay, sorry, I don't know about any web site.

> 
> 2. I'm curious why the LLVM language includes no instructions for memory
>     barriers or any kind of compare-and-swap (bus locking operation). Were
>     these considered? 

Yes.

> Were they deemed too platform-specific? 

No.

> What about
>     some definition of the atomicity of instructions (e.g., is a write of
>     a 32 bit value to memory guaranteed to be atomic)? More generally does
>     the LLVM language define a specific (at least partial) memory model?

Currently the language doesn't support atomic instructions. However,
some work has been done at UIUC to implement sufficient fundamental
instructions that could permit an entire threading and synchronization
package to be constructed.  This work is not complete, and is not in the
LLVM repository yet. I'm not sure of the status at UIUC of this effort,
as it hasn't been discussed in a while. It is definitely something that
will be needed going forward.
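
To make concrete what a compare-and-swap primitive gives a front end at
the source level, here is a minimal Java sketch. It uses the standard
java.util.concurrent.atomic.AtomicInteger class; the CasCounter class
and its method names are made up for this illustration. It is not LLVM
code and makes no claim about what the eventual LLVM instructions will
look like.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    // Lock-free increment: read the current value, then try to publish
    // current + 1. compareAndSet succeeds only if no other thread has
    // changed the value in between; otherwise we loop and retry.
    public int increment() {
        while (true) {
            int current = value.get();
            int next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    public int get() {
        return value.get();
    }

    public static void main(String[] args) throws InterruptedException {
        CasCounter counter = new CasCounter();
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        System.out.println(counter.get()); // prints 4000
    }
}
```

A JVM built on LLVM would need some way to lower compareAndSet to an
atomic operation on the target, which is why the missing instructions
matter for this use case.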

> 
> 3. Would it make sense to extend the LLVM language so that modules,
>     variables, functions, and instructions could be annotated with
>     arbitrary application-specific annotations? 

Some of these things already inherit from the "Annotable" class, which
permits Annotations on the object. However, their use is discouraged and
we will, eventually, remove them from the LLVM IR. The problem is that
Annotations create problems for the various passes that need to
understand them. We have decided, from a design perspective, that (a) if
something is important enough to be generally applicable, it should be
part of the LLVM IR, not tucked away in an Annotation, and (b) for
things specific to a language or system, Annotations are insufficient
anyway and a higher-level construction (possibly making reference to
LLVM IR objects) would be needed.

> These would be basically
>     ignored by LLVM but otherwise "ride along" with their associated items.
>     Of course, the impact on annotations of code transformations would have
>     to be defined (e.g., if a function is inlined, any associated annotations
>     are discarded).

Yes, we've been through this discussion many times before, and the
solution was to not support Annotations at all, as discussed above.
There are numerous issues with saving the annotations in the bytecode,
how they affect the passes, what happens to them after the code is
modified by a pass (as you noted), etc.

> 
>     The thought here is that more optimization may be possible when
>     information from the higher-level language is available. E.g. the
>     application could participate in transformations, using the annotations
>     to answer questions asked by the LLVM transformation via callbacks.

It's my opinion that those things should be handled by the higher-level
language's own passes on its AST where full semantic knowledge of the
language is available. Remember that LLVM provides an "Intermediate
Representation", not a high-level AST. The desire to support Annotations
is an attempt to force the IR into a higher level of abstraction than it
was designed for.  

The use of callbacks is problematic, as it would require LLVM to manage
numerous dynamic libraries that correspond to those callbacks, provide a
scheme for understanding which callbacks to call in various
circumstances, etc. Consider a bytecode file that was generated by
linking bytecode files from four or five different languages and then
delivered to another environment for further optimization and
execution. Are all those languages' dynamic libraries available so the
callbacks can be called?

> 
>     To give an example (perhaps this is not a real one because possibly it
>     can already be captured by LLVM alone) is the use of a Java final instance
>     field in a constructor. In Java we're guaranteed that the final field is
>     assigned to only once, and any read of that field must follow the initial
>     assignment, so even though the field is not read-only during the entire
>     constructor, it can be treated as such by any optimizing transformation.

LLVM would already recognize such a case and permit the appropriate
optimizations.
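
For reference, the pattern in question looks roughly like the following
Java sketch. The class and field names are hypothetical and chosen only
for illustration; the comments describe what the Java language
guarantees, not what any particular LLVM pass does.

```java
public class FinalFieldExample {
    private final int size;     // must be assigned exactly once
    private final int[] buffer;

    public FinalFieldExample(int size) {
        this.size = size;                  // the single assignment
        this.buffer = new int[this.size];  // read follows the assignment
        // Every read of this.size below is guaranteed to see the value
        // stored above, since javac rejects a second assignment.
        for (int i = 0; i < this.size; i++) {
            buffer[i] = i;
        }
    }

    // Sums the buffer contents: 0 + 1 + ... + (size - 1).
    public int sum() {
        int total = 0;
        for (int v : buffer) {
            total += v;
        }
        return total;
    }
}
```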

> 
> 4. Has anyone written Java JNI classes+native code that "wrap" the LLVM API,
>     so that the LLVM libraries can be utilized from Java code?

No, there's no Java interface at this time. Patches accepted :)  

There is, however, a burgeoning PyPy interface that is being
developed.

> 
> Thanks,
> -Archie
> __________________________________________________________________________
> Archie Cobbs      *        CTO, Awarix        *      http://www.awarix.com
> 

You're welcome. Hope it was useful. I'm sure others will respond as
well, so stay tuned.

Reid Spencer.