<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix">

      <br>

      On 18/11/13 21:43, Brandon Holt wrote:<br>

    </div>

    <blockquote

      cite="mid:A2E71300-0359-4D1C-BB5A-BB8F46FE8696@cs.washington.edu"

      type="cite">


      <div>I am working on a pass to extract small regions of code to

        run somewhere else (different node in a cluster). Basically what

        I need is the ability to isolate a region of code, get its

        inputs and outputs, create a new function with the extracted

        code and code aggregating the in and out parameters as structs

        that can be cast for a “void*”-based interface.</div>

      <div><br>

      </div>

      It looks like the CodeExtractor

      (include/Transforms/Util/CodeExtractor.h) does nearly all of this,

      with the exceptions that I need to generate a different “call”,

      and I need to be able to separate the outputs and inputs. <br>

    </blockquote>

    <br>

    <br>

    Hi Brandon,<br>

    <br>

    That sounds a lot like what I'm doing. I'm not using the code

    extractor though. Maybe you want to share ideas :)<br>

    <br>

    I have a tool to extract parts of code into new functions based on a

    given partitioning. The inputs to the tool are:<br>

    <br>

        1. The sequential code in LLVM IR (we get this from clang).<br>

        2. A machine file that contains the specification of a physical

    architecture. For example, you can specify a single node with two

    quad-cores, or a whole cluster with several nodes, each with two

    quad-cores and an FPGA accelerator board.<br>

        3. A file that maps each Basic Block to one of the architecture

    devices (you can also specify a general mapping for convenience, and

    only map a few blocks to your accelerators or different CPUs).<br>
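    <br>
    To make the mapping idea in item 3 concrete, here is a minimal Python
    sketch of a general default plus a few per-block overrides. The names
    (DEFAULT_DEVICE, BLOCK_OVERRIDES, resolve_device), the device labels and
    the dictionary layout are assumptions made up for illustration; this is
    not the real file format of the tool.<br>
    <pre>
```python
# Hypothetical mapping: one general default plus a few per-block overrides.
DEFAULT_DEVICE = "cpu0"
BLOCK_OVERRIDES = {
    "for.body": "fpga0",  # assumed hot block sent to the accelerator
    "if.then": "cpu1",    # assumed block pinned to a second CPU
}


def resolve_device(block_name, overrides=BLOCK_OVERRIDES, default=DEFAULT_DEVICE):
    """Return the device for a basic block, falling back to the default."""
    return overrides.get(block_name, default)


if __name__ == "__main__":
    for name in ["entry", "for.body", "if.then", "for.end"]:
        print(name, resolve_device(name))
```
</pre>
    Only the blocks listed in the overrides leave the default device, so
    the mapping file stays short when most of the code runs in one place.<br>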

    <br>

    Based on the partitioning and the architecture file, we extract BBs

    into functions and move these to different LLVM modules, one for

    each device of the architecture. Each module is then compiled with a

    machine-specific backend and against a device-specific

    communications library. All the executables can be run in a MIMD

    fashion in a cluster.<br>
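    <br>
    The partitioning step can be pictured roughly as below. This is only a
    sketch under my own assumptions: basic blocks are plain strings, a
    "module" is just a list of block names, and partition_blocks is a
    hypothetical helper, not a function of the actual tool or of LLVM.<br>
    <pre>
```python
from collections import defaultdict


def partition_blocks(block_to_device):
    """Group basic-block names by device; each group stands in for one module."""
    modules = defaultdict(list)
    for block, device in block_to_device.items():
        modules[device].append(block)
    return dict(modules)


if __name__ == "__main__":
    # Assumed block names and device labels, for illustration only.
    mapping = {"entry": "cpu0", "for.body": "fpga0", "for.end": "cpu0"}
    for device, blocks in partition_blocks(mapping).items():
        print(device, blocks)
```
</pre>
    In the real flow each group would become its own LLVM module and be
    handed to the backend for that device.<br>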

    <br>

    The inputs and outputs are handled in two ways:<br>

    <br>

        a) By means of the virtual registers. When these traverse device

    boundaries, they are turned into function parameters. The compiler

    inserts marshalling/unmarshalling code as well as "server" and

    "client" stubs.<br>

        b) By means of explicit prefetching (which we plan to

    automate in the compiler in the future as well). This is used for data

    structures and dynamic memory. Essentially, things that need a

    "getelementptr" at some point.<br>
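    <br>
    For case a), the marshalling and the stubs look conceptually like the
    Python sketch below. Everything in it is an assumption for
    illustration: the wire layout (one 32-bit integer and one 64-bit float
    in network byte order), the function names, and the direct function
    call that stands in for the communications library. The code that the
    compiler really inserts works on LLVM IR, not on Python.<br>
    <pre>
```python
import struct

ARGS_FORMAT = "!id"   # assumed input layout: int32 followed by float64
RESULT_FORMAT = "!d"  # assumed output layout: a single float64


def extracted_region(n, x):
    """Stand-in for the code that was moved to another device."""
    return n * x


def server_stub(payload):
    """Unmarshal the inputs, run the extracted code, marshal the output."""
    n, x = struct.unpack(ARGS_FORMAT, payload)
    return struct.pack(RESULT_FORMAT, extracted_region(n, x))


def client_stub(n, x, send=server_stub):
    """Marshal the live-in values, send them, and unmarshal the reply."""
    reply = send(struct.pack(ARGS_FORMAT, n, x))
    (result,) = struct.unpack(RESULT_FORMAT, reply)
    return result


if __name__ == "__main__":
    print(client_stub(3, 1.5))
```
</pre>
    The caller only sees client_stub, so the original code does not need to
    know on which device the extracted region ends up running.<br>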

    <br>

    I never made this code available because it's still a research

    thing, but your question awoke my interest. Could you elaborate on

    what you intend to do?<br>

    <br>

    Cheers<br>

    <pre class="moz-signature" cols="72">-- 

Pablo Barrio

Dpt. Electrical Engineering - Technical University of Madrid

Office C-203

Avda. Complutense s/n, 28040 Madrid

Tel. (+34) 915495700 ext. 4234

@: <a class="moz-txt-link-abbreviated" href="mailto:pbarrio@die.upm.es">pbarrio@die.upm.es</a>

</pre>

    <br>

  </body>

</html>