<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <p>Hi,</p>

    <p>i am the main author of RV, the Region Vectorizer

      (github.com/cdl-saarland/rv). I want to share our standpoint as

      potential users of the proposed vector-length agnostic IR (RISC-V,

      ARM SVE).<br>

    </p>

    <p>-- support for `llvm.experimental.vector.reduce.*` intrinsics --<br>

    </p>

    <p>RV relies heavily on predicate reductions (`or` and `and`

      reduction) to tame divergent loops and provide a vector-length

      agnostic programming model on LLVM IR. I'd really like to see

      these adopted early on in the new VLA backends so we can fully

      support these targets from the start. Without these generic

      intrinsics, we would either need to emit target specific ones or

      go through the painful process of VLA-style reduction trees with

      loops or the like.</p>

    <br>

    -- setting the vector length (MVL) --

    <p>I really like the idea of the `inherits_vlen` attribute. Absence

      of this attribute in a callee means we can safely stop tracking

      the vector length across the call boundary.<br>

    </p>

    <p>However, i think there are some issues with the `vlen token`

      approach.<br>

    </p>

    <p>* Why do you need an explicit vlen token if there is a 1 : 1-0

      correspondence between functions and vlen tokens?<br>

    </p>

    <p>* My main concern is that you are navigating towards a local

      optimum here. All is well as long as there is only one vector

      length per function. However, if the architecture supports

      changing the vector length at any point but you explicitly forbid

      it, programmers will complain, well, i will for one ;-) Once you

      give in to that demand you are facing the situation that multiple

      vector length tokens are live within the same function. This means

      you have to stop transformations from mixing vector operations

      with different vector lengths: these would otherwise incur an

      expensive state change at every vlen transition. However, there is

      no natural way to express that two SSA values (vlen tokens) must

      not be live at the same program point.<br>

    </p>

    <div class="moz-cite-prefix">On 06/11/2018 05:47 PM, Robin Kruppe

      via llvm-dev wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:CAJrduR5cH64hq8M-JS7jVrVWhmFBDRMqKecU=p3dGkc2pZNYDg@mail.gmail.com">

      <pre wrap="">There are some operations that use vl for things other than simple

masking. To give one example, "speculative" loads (which silencing

some exceptions to safely permit vectorization of some loops with

data-dependent exits, such as strlen) can shrink vl as a side effect.

I believe this can be handled by modelling all relevant operations

(including setvl itself) as intrinsics that have side effects or

read/write inaccessible memory. However, if you want to have the

"current" vl (or equivalent mask) around as SSA value, you need to

"reload" it after any operation that updates vl. That seems like it

could get a bit complex if you want to do it efficiently (in the

limit, it seems equivalent to SSA construction).</pre>

    </blockquote>

    I think modeling the vector length as state isn't as bad as it may

    sound first. In fact, how about modeling the "hard" vector length as

    a thread_local global variable? That way there is exactly one valid

    vector length value at every point (defined by the value of the

    thread_local global variable of the exact name). There is no need

    for a "demanded vlen" analyses: the global variable yields the value

    immediately. The RISC-V backend can map the global directly to the

    vlen register. If a target does not support a re-configurable vector

    length (SVE), it is safe to run SSA construction during legalization

    and use explicit predication instead. You'd perform SSA construction

    only at the backend/legalization phase.<br>

    Vice versa coming from IR targeted at LLVM SVE, you can go the other

    way, run a demanded vlen analysis, and encode it explicitly in the

    program. vlen changes are expensive and should be rare anyway.<br>

    <br>

    <tt>; explicit vlen_state modelling in RV could look like this:<br>

      <br>

    </tt>

    <span class="vg"><tt>@vlen_state</tt></span><tt> </tt><tt><span

        class="p">=</span></tt><tt> </tt><tt><span class="k">thread_local

      </span></tt><tt><span class="k">global</span></tt><tt> token ;

      this gives AA a fixed point to constraint vlen-dependent

      operations</tt><br>

    <span class="m"></span><br>

    <span class="m"></span><tt>llvm.vla.setvl(i32 %n)                  ;

      implicitly writes-only %vlen</tt><tt>_state<br>

    </tt><tt>i32 llvm.vla.getvl()                    ; implicitly

      reads-only %vlen_state</tt><br>

    <tt><br>

      llvm.vla.fadd.f64(f64, f64)           ; implicitly reads-only

      %vlen_state</tt><br>

    <tt>llvm.vla.fdiv.f64(f64, f64)           : .. same<br>

      <br>

    </tt><tt><tt>; this implements the "speculative" load mentioned in

        the quote above (writes %vlen_state. I suppose it also reads it

        first?)</tt><tt><br>

      </tt></tt><tt><scalable 1 x f64> llvm.riscv.probe.f64(%ptr)</tt><tt><br>

    </tt><br>

    <p>By relying on memory dependence, this also implies that

      arithmetic operations can be re-ordered freely as long as

      vlen_state does not change between them (SLP, "loop mix (CGO16)",

      ..).<br>

    </p>

    <p>Regarding function calls, if the callee does not have the

      'inherits_vlen' attribute, the target can use a default value at

      function entry (max width or "undef"). Otherwise, the vector

      length needs to be communicated from caller to callee. However,

      the `vlen_state` variable already achieves that for a first

      implementation.</p>

    <p>Last but not least, thank you all for working on this! I am

      really looking forward to playing around with vla architectures in

      LLVM.<br>

    </p>

    <p>Regards,</p>

    <p>Simon</p>

    <br>

    <div class="moz-cite-prefix">On 07/02/2018 11:53 AM, Graham Hunter

      via llvm-dev wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:7EB91643-A82D-4711-BEA7-544D7E1867BB@arm.com">

      <pre wrap="">Hi,

I've updated the RFC slightly based on the discussion within the thread, reposted below. Let me know if I've missed anything or if more clarification is needed.

Thanks,

-Graham

=============================================================

Supporting SIMD instruction sets with variable vector lengths

=============================================================

In this RFC we propose extending LLVM IR to support code-generation for variable

length vector architectures like Arm's SVE or RISC-V's 'V' extension. Our

approach is backwards compatible and should be as non-intrusive as possible; the

only change needed in other backends is how size is queried on vector types, and

it only requires a change in which function is called. We have created a set of

proof-of-concept patches to represent a simple vectorized loop in IR and

generate SVE instructions from that IR. These patches (listed in section 7 of

this rfc) can be found on Phabricator and are intended to illustrate the scope

of changes required by the general approach described in this RFC.

==========

Background

==========

*ARMv8-A Scalable Vector Extensions* (SVE) is a new vector ISA extension for

AArch64 which is intended to scale with hardware such that the same binary

running on a processor with longer vector registers can take advantage of the

increased compute power without recompilation.

As the vector length is no longer a compile-time known value, the way in which

the LLVM vectorizer generates code requires modifications such that certain

values are now runtime evaluated expressions instead of compile-time constants.

Documentation for SVE can be found at

<a class="moz-txt-link-freetext" href="https://developer.arm.com/docs/ddi0584/latest/arm-architecture-reference-manual-supplement-the-scalable-vector-extension-sve-for-armv8-a">https://developer.arm.com/docs/ddi0584/latest/arm-architecture-reference-manual-supplement-the-scalable-vector-extension-sve-for-armv8-a</a>

========

Contents

========

The rest of this RFC covers the following topics:

1. Types -- a proposal to extend VectorType to be able to represent vectors that

   have a length which is a runtime-determined multiple of a known base length.

2. Size Queries - how to reason about the size of types for which the size isn't

   fully known at compile time.

3. Representing the runtime multiple of vector length in IR for use in address

   calculations and induction variable comparisons.

4. Generating 'constant' values in IR for vectors with a runtime-determined

   number of elements.

5. An explanation of splitting/concatentating scalable vectors.

6. A brief note on code generation of these new operations for AArch64.

7. An example of C code and matching IR using the proposed extensions.

8. A list of patches demonstrating the changes required to emit SVE instructions

   for a loop that has already been vectorized using the extensions described

   in this RFC.

========

1. Types

========

To represent a vector of unknown length a boolean `Scalable` property has been

added to the `VectorType` class, which indicates that the number of elements in

the vector is a runtime-determined integer multiple of the `NumElements` field.

Most code that deals with vectors doesn't need to know the exact length, but

does need to know relative lengths -- e.g. get a vector with the same number of

elements but a different element type, or with half or double the number of

elements.

In order to allow code to transparently support scalable vectors, we introduce

an `ElementCount` class with two members:

- `unsigned Min`: the minimum number of elements.

- `bool Scalable`: is the element count an unknown multiple of `Min`?

For non-scalable vectors (``Scalable=false``) the scale is considered to be

equal to one and thus `Min` represents the exact number of elements in the

vector.

The intent for code working with vectors is to use convenience methods and avoid

directly dealing with the number of elements. If needed, calling

`getElementCount` on a vector type instead of `getVectorNumElements` can be used

to obtain the (potentially scalable) number of elements. Overloaded division and

multiplication operators allow an ElementCount instance to be used in much the

same manner as an integer for most cases.

This mixture of compile-time and runtime quantities allow us to reason about the

relationship between different scalable vector types without knowing their

exact length.

The runtime multiple is not expected to change during program execution for SVE,

but it is possible. The model of scalable vectors presented in this RFC assumes

that the multiple will be constant within a function but not necessarily across

functions. As suggested in the recent RISC-V rfc, a new function attribute to

inherit the multiple across function calls will allow for function calls with

vector arguments/return values and inlining/outlining optimizations.

IR Textual Form

---------------

The textual form for a scalable vector is:

``<scalable <n> x <type>>``

where `type` is the scalar type of each element, `n` is the minimum number of

elements, and the string literal `scalable` indicates that the total number of

elements is an unknown multiple of `n`; `scalable` is just an arbitrary choice

for indicating that the vector is scalable, and could be substituted by another.

For fixed-length vectors, the `scalable` is omitted, so there is no change in

the format for existing vectors.

Scalable vectors with the same `Min` value have the same number of elements, and

the same number of bytes if `Min * sizeof(type)` is the same (assuming they are

used within the same function):

``<scalable 4 x i32>`` and ``<scalable 4 x i8>`` have the same number of

  elements.

``<scalable 4 x i32>`` and ``<scalable 8 x i16>`` have the same number of

  bytes.

IR Bitcode Form

---------------

To serialize scalable vectors to bitcode, a new boolean field is added to the

type record. If the field is not present the type will default to a fixed-length

vector type, preserving backwards compatibility.

Alternatives Considered

-----------------------

We did consider one main alternative -- a dedicated target type, like the

x86_mmx type.

A dedicated target type would either need to extend all existing passes that

work with vectors to recognize the new type, or to duplicate all that code

in order to get reasonable code generation and autovectorization.

This hasn't been done for the x86_mmx type, and so it is only capable of

providing support for C-level intrinsics instead of being used and recognized by

passes inside llvm.

Although our current solution will need to change some of the code that creates

new VectorTypes, much of that code doesn't need to care about whether the types

are scalable or not -- they can use preexisting methods like

`getHalfElementsVectorType`. If the code is a little more complex,

`ElementCount` structs can be used instead of an `unsigned` value to represent

the number of elements.

===============

2. Size Queries

===============

This is a proposal for how to deal with querying the size of scalable types for

analysis of IR. While it has not been implemented in full, the general approach

works well for calculating offsets into structures with scalable types in a

modified version of ComputeValueVTs in our downstream compiler.

For current IR types that have a known size, all query functions return a single

integer constant. For scalable types a second integer is needed to indicate the

number of bytes/bits which need to be scaled by the runtime multiple to obtain

the actual length.

For primitive types, `getPrimitiveSizeInBits()` will function as it does today,

except that it will no longer return a size for vector types (it will return 0,

as it does for other derived types). The majority of calls to this function are

already for scalar rather than vector types.

For derived types, a function `getScalableSizePairInBits()` will be added, which

returns a pair of integers (one to indicate unscaled bits, the other for bits

that need to be scaled by the runtime multiple). For backends that do not need

to deal with scalable types the existing methods will suffice, but a debug-only

assert will be added to them to ensure they aren't used on scalable types.

Similar functionality will be added to DataLayout.

Comparisons between sizes will use the following methods, assuming that X and

Y are non-zero integers and the form is of { unscaled, scaled }.

{ X, 0 } <cmp> { Y, 0 }: Normal unscaled comparison.

{ 0, X } <cmp> { 0, Y }: Normal comparison within a function, or across

                         functions that inherit vector length. Cannot be

                         compared across non-inheriting functions.

{ X, 0 } > { 0, Y }: Cannot return true.

{ X, 0 } = { 0, Y }: Cannot return true.

{ X, 0 } < { 0, Y }: Can return true.

{ Xu, Xs } <cmp> { Yu, Ys }: Gets complicated, need to subtract common

                             terms and try the above comparisons; it

                             may not be possible to get a good answer.

It's worth noting that we don't expect the last case (mixed scaled and

unscaled sizes) to occur. Richard Sandiford's proposed C extensions

(<a class="moz-txt-link-freetext" href="http://lists.llvm.org/pipermail/cfe-dev/2018-May/057830.html">http://lists.llvm.org/pipermail/cfe-dev/2018-May/057830.html</a>) explicitly

prohibits mixing fixed-size types into sizeless struct.

I don't know if we need a 'maybe' or 'unknown' result for cases comparing scaled

vs. unscaled; I believe the gcc implementation of SVE allows for such

results, but that supports a generic polynomial length representation.

My current intention is to rely on functions that clone or copy values to

check whether they are being used to copy scalable vectors across function

boundaries without the inherit vlen attribute and raise an error there instead

of requiring passing the Function a type size is from for each comparison. If

there's a strong preference for moving the check to the size comparison function

let me know; I will be starting work on patches for this later in the year if

there's no major problems with the idea.

Future Work

-----------

Since we cannot determine the exact size of a scalable vector, the

existing logic for alias detection won't work when multiple accesses

share a common base pointer with different offsets.

However, SVE's predication will mean that a dynamic 'safe' vector length

can be determined at runtime, so after initial support has been added we

can work on vectorizing loops using runtime predication to avoid aliasing

problems.

Alternatives Considered

-----------------------

Marking scalable vectors as unsized doesn't work well, as many parts of

llvm dealing with loads and stores assert that 'isSized()' returns true

and make use of the size when calculating offsets.

We have considered introducing multiple helper functions instead of

using direct size queries, but that doesn't cover all cases. It may

still be a good idea to introduce them to make the purpose in a given

case more obvious, e.g. 'requiresSignExtension(Type*,Type*)'.

========================================

3. Representing Vector Length at Runtime

========================================

With a scalable vector type defined, we now need a way to represent the runtime

length in IR in order to generate addresses for consecutive vectors in memory

and determine how many elements have been processed in an iteration of a loop.

We have added an experimental `vscale` intrinsic to represent the runtime

multiple. Multiplying the result of this intrinsic by the minimum number of

elements in a vector gives the total number of elements in a scalable vector.

Fixed-Length Code

-----------------

Assuming a vector type of <4 x <ty>>

``

vector.body:

  %index = phi i64 [ %index.next, %vector.body ], [ 0, %vector.body.preheader ]

  ;; <loop body>

  ;; Increment induction var

  %index.next = add i64 %index, 4

  ;; <check and branch>

``

Scalable Equivalent

-------------------

Assuming a vector type of <scalable 4 x <ty>>

``

vector.body:

  %index = phi i64 [ %index.next, %vector.body ], [ 0, %vector.body.preheader ]

  ;; <loop body>

  ;; Increment induction var

  %vscale64 = call i64 @llvm.experimental.vector.vscale.64()

  %index.next = add i64 %index, mul (i64 %vscale64, i64 4)

  ;; <check and branch>

``

===========================

4. Generating Vector Values

===========================

For constant vector values, we cannot specify all the elements as we can for

fixed-length vectors; fortunately only a small number of easily synthesized

patterns are required for autovectorization. The `zeroinitializer` constant

can be used in the same manner as fixed-length vectors for a constant zero

splat. This can then be combined with `insertelement` and `shufflevector`

to create arbitrary value splats in the same manner as fixed-length vectors.

For constants consisting of a sequence of values, an experimental `stepvector`

intrinsic has been added to represent a simple constant of the form

`<0, 1, 2... num_elems-1>`. To change the starting value a splat of the new

start can be added, and changing the step requires multiplying by a splat.

Fixed-Length Code

-----------------

``

  ;; Splat a value

  %insert = insertelement <4 x i32> undef, i32 %value, i32 0

  %splat = shufflevector <4 x i32> %insert, <4 x i32> undef, <4 x i32> zeroinitializer

  ;; Add a constant sequence

  %add = add <4 x i32> %splat, <i32 2, i32 4, i32 6, i32 8>

``

Scalable Equivalent

-------------------

``

  ;; Splat a value

  %insert = insertelement <scalable 4 x i32> undef, i32 %value, i32 0

  %splat = shufflevector <scalable 4 x i32> %insert, <scalable 4 x i32> undef, <scalable 4 x i32> zeroinitializer

  ;; Splat offset + stride (the same in this case)

  %insert2 = insertelement <scalable 4 x i32> under, i32 2, i32 0

  %str_off = shufflevector <scalable 4 x i32> %insert2, <scalable 4 x i32> undef, <scalable 4 x i32> zeroinitializer

  ;; Create sequence for scalable vector

  %stepvector = call <scalable 4 x i32> @llvm.experimental.vector.stepvector.nxv4i32()

  %mulbystride = mul <scalable 4 x i32> %stepvector, %str_off

  %addoffset = add <scalable 4 x i32> %mulbystride, %str_off

  ;; Add the runtime-generated sequence

  %add = add <scalable 4 x i32> %splat, %addoffset

``

Future Work

-----------

Intrinsics cannot currently be used for constant folding. Our downstream

compiler (using Constants instead of intrinsics) relies quite heavily on this

for good code generation, so we will need to find new ways to recognize and

fold these values.

===========================================

5. Splitting and Combining Scalable Vectors

===========================================

Splitting and combining scalable vectors in IR is done in the same manner as

for fixed-length vectors, but with a non-constant mask for the shufflevector.

The following is an example of splitting a <scalable 4 x double> into two

separate <scalable 2 x double> values.

``

  %vscale64 = call i64 @llvm.experimental.vector.vscale.64()

  ;; Stepvector generates the element ids for first subvector

  %sv1 = call <scalable 2 x i64> @llvm.experimental.vector.stepvector.nxv2i64()

  ;; Add vscale * 2 to get the starting element for the second subvector

  %ec = mul i64 %vscale64, 2

  %ec.ins = insertelement <scalable 2 x i64> undef, i64 %ec, i32 0

  %ec.splat = shufflevector <scalable 2 x i64> %9, <scalable 2 x i64> undef, <scalable 2 x i32> zeroinitializer

  %sv2 = add <scalable 2 x i64> %ec.splat, %stepvec64

  ;; Perform the extracts

  %res1 = shufflevector <scalable 4 x double> %in, <scalable 4 x double> undef, <scalable 2 x i64> %sv1

  %res2 = shufflevector <scalable 4 x double> %in, <scalable 4 x double> undef, <scalable 2 x i64> %sv2

``

==================

6. Code Generation

==================

IR splats will be converted to an experimental splatvector intrinsic in

SelectionDAGBuilder.

All three intrinsics are custom lowered and legalized in the AArch64 backend.

Two new AArch64ISD nodes have been added to represent the same concepts

at the SelectionDAG level, while splatvector maps onto the existing

AArch64ISD::DUP.

GlobalISel

----------

Since GlobalISel was enabled by default on AArch64, it was necessary to add

scalable vector support to the LowLevelType implementation. A single bit was

added to the raw_data representation for vectors and vectors of pointers.

In addition, types that only exist in destination patterns are planted in

the enumeration of available types for generated code. While this may not be

necessary in future, generating an all-true 'ptrue' value was necessary to

convert a predicated instruction into an unpredicated one.

==========

7. Example

==========

The following example shows a simple C loop which assigns the array index to

the array elements matching that index. The IR shows how vscale and stepvector

are used to create the needed values and to advance the index variable in the

loop.

C Code

------

``

void IdentityArrayInit(int *a, int count) {

  for (int i = 0; i < count; ++i)

    a[i] = i;

}

``

Scalable IR Vector Body

-----------------------

``

vector.body.preheader:

  ;; Other setup

  ;; Stepvector used to create initial identity vector

  %stepvector = call <scalable 4 x i32> @llvm.experimental.vector.stepvector.nxv4i32()

  br vector.body

vector.body

  %index = phi i64 [ %index.next, %vector.body ], [ 0, %vector.body.preheader ]

  %0 = phi i64 [ %1, %vector.body ], [ 0, %vector.body.preheader ]

           ;; stepvector used for index identity on entry to loop body ;;

  %vec.ind7 = phi <scalable 4 x i32> [ %step.add8, %vector.body ],

                                     [ %stepvector, %vector.body.preheader ]

  %vscale64 = call i64 @llvm.experimental.vector.vscale.64()

  %vscale32 = trunc i64 %vscale64 to i32

  %1 = add i64 %0, mul (i64 %vscale64, i64 4)

           ;; vscale splat used to increment identity vector ;;

  %insert = insertelement <scalable 4 x i32> undef, i32 mul (i32 %vscale32, i32 4), i32 0

  %splat shufflevector <scalable 4 x i32> %insert, <scalable 4 x i32> undef, <scalable 4 x i32> zeroinitializer

  %step.add8 = add <scalable 4 x i32> %vec.ind7, %splat

  %2 = getelementptr inbounds i32, i32* %a, i64 %0

  %3 = bitcast i32* %2 to <scalable 4 x i32>*

  store <scalable 4 x i32> %vec.ind7, <scalable 4 x i32>* %3, align 4

           ;; vscale used to increment loop index

  %index.next = add i64 %index, mul (i64 %vscale64, i64 4)

  %4 = icmp eq i64 %index.next, %n.vec

  br i1 %4, label %middle.block, label %vector.body, !llvm.loop !5

``

==========

8. Patches

==========

List of patches:

1. Extend VectorType: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D32530">https://reviews.llvm.org/D32530</a>

2. Vector element type Tablegen constraint: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D47768">https://reviews.llvm.org/D47768</a>

3. LLT support for scalable vectors: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D47769">https://reviews.llvm.org/D47769</a>

4. EVT strings and Type mapping: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D47770">https://reviews.llvm.org/D47770</a>

5. SVE Calling Convention: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D47771">https://reviews.llvm.org/D47771</a>

6. Intrinsic lowering cleanup: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D47772">https://reviews.llvm.org/D47772</a>

7. Add VScale intrinsic: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D47773">https://reviews.llvm.org/D47773</a>

8. Add StepVector intrinsic: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D47774">https://reviews.llvm.org/D47774</a>

9. Add SplatVector intrinsic: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D47775">https://reviews.llvm.org/D47775</a>

10. Initial store patterns: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D47776">https://reviews.llvm.org/D47776</a>

11. Initial addition patterns: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D47777">https://reviews.llvm.org/D47777</a>

12. Initial left-shift patterns: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D47778">https://reviews.llvm.org/D47778</a>

13. Implement copy logic for Z regs: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D47779">https://reviews.llvm.org/D47779</a>

14. Prevectorized loop unit test: <a class="moz-txt-link-freetext" href="https://reviews.llvm.org/D47780">https://reviews.llvm.org/D47780</a>

_______________________________________________

LLVM Developers mailing list

<a class="moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>

<a class="moz-txt-link-freetext" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>

</pre>

    </blockquote>

    <br>

    <pre class="moz-signature" cols="72">-- 

Simon Moll

Researcher / PhD Student

Compiler Design Lab (Prof. Hack)

Saarland University, Computer Science

Building E1.3, Room 4.31

Tel. +49 (0)681 302-57521 : <a class="moz-txt-link-abbreviated" href="mailto:moll@cs.uni-saarland.de">moll@cs.uni-saarland.de</a>

Fax. +49 (0)681 302-3065  : <a class="moz-txt-link-freetext" href="http://compilers.cs.uni-saarland.de/people/moll">http://compilers.cs.uni-saarland.de/people/moll</a></pre>

  </body>

</html>