[llvm-commits] [PATCH] Add bindings for walking the instruction stream.

Gordon Henriksen gordonhenriksen at mac.com
Thu Mar 20 13:50:43 PDT 2008


Hi Erick,

Thanks, good analysis. I have some feedback below and inline.

On 2008-03-20, at 02:59, Erick Tryzelaar wrote:

> Use some of the new features that gordonh provided. However, I don't  
> think we need to use everything:
>
> Since the LLVMGetLastInstruction actually returns one past the last  
> instruction that is mainly used for c++style iteration. Since we  
> already have to protect against next_instr and etc from returning a  
> null pointer, we can already model "last_instr" as "None".

That's not correct; that function doesn't return an iterator, it  
actually returns the last instruction (or null if the block is empty).  
This is the tip of the tail of the doubly linked list (≠ List.tl),  
and is the terminator instruction—very important to following the CFG.

> I don't think we need position_builder as proposed. It's more of a  
> convenience function so that you don't have to write "match  
> next_instr i with ...", but I don't think you need it that often,  
> and it's trivial to write a function in the client to do that for us.
>

> I've added a couple wrapper scripts that (I think) would be more  
> common, position_at_start and position_after, that will also safely  
> handle builders without any instructions in them.

position_builder b bb (first_instr bb) and position_builder b bb  
(last_instr bb) handle empty blocks correctly. Both the *_instr  
functions return None, making position_builder use the only valid  
iterator (unwrap(bb)->begin() == unwrap(bb)->end() by definition).

> Finally, I migrated a couple functions from llvm_ocaml.c to llvm.ml.  
> While this probably doesn't make a difference, it might allow ocaml  
> to do some more optimizations.
>
> diff --git bindings/ocaml/llvm/llvm.ml bindings/ocaml/llvm/llvm.ml
> index ac05a4d..37a11fe 100644
> --- bindings/ocaml/llvm/llvm.ml
> +++ bindings/ocaml/llvm/llvm.ml
> @@ -335,6 +335,35 @@ external insert_block : string -> llbasicblock - 
> > llbasicblock
>
>  (*--... Operations on  
> instructions .........................................--*)
>  external instr_parent : llvalue -> llbasicblock =  
> "LLVMGetInstructionParent"
> +external instr_parent : llvalue -> llbasicblock =  
> "LLVMGetInstructionParent"

?

> +external first_instr: llbasicblock -> llvalue option =  
> "llvm_first_instr"

last_instr bb is the terminator and quite important for navigating the  
CFG.

Actually, analysis and transformation algorithms would be well-served  
initial_instr/terminator bindings which raise Not_found instead of  
yielding None. This would serve well such clients since they would not  
have to explicitly handle the case of an empty block or one with with  
no terminator, neither of which occur in valid IR.

It would not serve well my position_builder use case, so perhaps there  
should be redundant bindings to serve the two needs? Hm.

> +external next_instr: llvalue -> llvalue option = "llvm_next_instr"
> +external prev_instr: llvalue -> llvalue option = "llvm_prev_instr"
> +
> +let iter_block f bb =
> +  let rec aux i =
> +    match i with
> +    | None -> ()
> +    | Some next -> f next; aux (next_instr next)
> +  in
> +  aux (first_instr bb)
> +
> +let fold_left_block f init bb =
> +  let rec aux init i =
> +    match i with
> +    | None -> init
> +    | Some next -> aux (f init next) (next_instr next)
> +  in
> +  aux init (first_instr bb)

Nice.

> +
> +let fold_right_block f bb init =
> +  let rec aux i init =
> +    match i with
> +    | None -> init
> +    | Some next -> f next (aux (next_instr next) init)
> +  in
> +  aux (first_instr bb) init
> +

This isn't tail recursive but it could be if you bound  
LLVMGetLastInstruction and used prev_instr.

>
>  (*--... Operations on call  
> sites ...........................................--*)
>  external instruction_call_conv: llvalue -> int
> @@ -349,14 +378,42 @@ external incoming : llvalue -> (llvalue *  
> llbasicblock) list = "llvm_incoming"
>
>
>  (*===-- Instruction builders  
> ----------------------------------------------===*)
> -external builder: unit-> llbuilder = "llvm_builder"
> -external builder_before : llvalue -> llbuilder =  
> "llvm_builder_before"
> -external builder_at_end : llbasicblock -> llbuilder =  
> "llvm_builder_at_end"
> +external builder : unit -> llbuilder = "llvm_builder"
>  external position_before : llvalue -> llbuilder -> unit =  
> "llvm_position_before"
>  external position_at_end : llbasicblock -> llbuilder -> unit
>                           = "llvm_position_at_end"
>  external insertion_block : llbuilder -> llbasicblock =  
> "llvm_insertion_block"
>
> +let position_after i b =
> +  match next_instr i with
> +  | Some i -> position_before i b
> +  | None -> position_at_end (insertion_block b) b
> +
> +let position_at_start bb b =
> +  match first_instr bb with
> +  | Some i -> position_before i b
> +  | None -> position_at_end bb b
> +
> +let builder_before i =
> +  let b = builder () in
> +  position_before i b;
> +  b
> +
> +let builder_after i =
> +  let b = builder () in
> +  position_after i b;
> +  b
> +
> +let builder_at_start bb =
> +  let b = builder () in
> +  position_at_start bb b;
> +  b
> +
> +let builder_at_end bb =
> +  let b = builder () in
> +  position_at_end bb b;
> +  b
> +

I was trying to avoid this n x m explosion, although it is probably  
bounded at 8 = {builder,position}_{before,after,at_end,at_start}. In  
its place, I advocate the composable concepts of:

IR navigation (first, next, last, prev). This is useful in other  
contexts.
A pair of builder positioning functions (builder_at, position_builder)  
that are as expressive as LLVMBuilder(BasicBlock*,  
BasicBlock::iterator).

Your first e-mail implied you were for such a concept, although I  
think you over engineered it slightly. A perfect functional concept  
for LLVM's ilist iterator would be thus:

type 'a 'b llposition =
| Before of 'a
| At_end of 'b

With this, position_builder would simply have type llvalue  
llbasicblock llposition -> llbuilder -> unit and I would seriously  
consider deleting {builder,position}_{at_end,before} since they are  
trivial:

external builder : unit -> llbuilder
external position_builder : llvalue llbasicblock llposition -> llbuilder

let builder_at ip =
   let b = builder () in
   position_builder ip b;
   b

let builder_at_end bb = builder_at (At_end bb)
let builder_before i = builder_at (Before i)
let builder_at_start bb = builder_at (first_instr bb)
let builder_after i = builder_at (succ_instr i)
let position_at_end bb = position_builder (At_end bb)
let position_before i = position_builder (Before i)
let position_at_start bb = position_builder (first_instr bb)
let position_after i = position_builder (succ_instr i)

>  (*--...  
> Terminators 
>  ........................................................--*)
>  external build_ret_void : llbuilder -> llvalue =  
> "llvm_build_ret_void"
>  external build_ret : llvalue -> llbuilder -> llvalue =  
> "llvm_build_ret"
> diff --git bindings/ocaml/llvm/llvm.mli bindings/ocaml/llvm/llvm.mli
> index 5996ecd..854f6e5 100644
> --- bindings/ocaml/llvm/llvm.mli
> +++ bindings/ocaml/llvm/llvm.mli
> @@ -898,6 +898,26 @@ external block_of_value : llvalue ->  
> llbasicblock = "LLVMValueAsBasicBlock"
>      See the method [llvm::Instruction::getParent]. *)
>  external instr_parent : llvalue -> llbasicblock =  
> "LLVMGetInstructionParent"
>
> +(** [first_instr bb] is the first instruction of a basic block. *)
> +external first_instr: llbasicblock -> llvalue option =  
> "llvm_first_instr"

s/a basic block/the basic block [bb]/
Please mention the None case.
Likewise in other instances.

> +
> +(** [next_instr bb] is the instruction after 'i'. *)

s/bb/i/
s/'i'/the instruction [i]/
Likewise below.

> +external next_instr: llvalue -> llvalue option = "llvm_next_instr"
> +
> +(** [prev_instr bb] is the instruction after 'i'. *)
> +external prev_instr: llvalue -> llvalue option = "llvm_prev_instr"
> +
> +(** [iter_block bb] iterates over all the instructions in the basic  
> block. *)
> +val iter_block: (llvalue -> unit) -> llbasicblock -> unit
> +
> +(** [fold_left_block f bb] folds left the function f over all the  
> instructions
> +    in the basic block. *)
> +val fold_left_block: ('a -> llvalue -> 'a) -> 'a -> llbasicblock ->  
> 'a

s/the function/the function [f]/
Likewise below.

> +
> +(** [fold_right_block bb f] folds right the function f over all the  
> instructions
> +    in the basic block. *)
> +val fold_right_block: (llvalue -> 'a -> 'a) -> llbasicblock -> 'a - 
> > 'a
> +
>
>  (** {7 Operations on call sites} *)
>
> @@ -940,16 +960,33 @@ external builder: unit-> llbuilder
>
>  (** [builder_before ins] creates an instruction builder positioned  
> before the
>      instruction [isn]. See the constructor for [llvm::LLVMBuilder].  
> *)
> -external builder_before : llvalue -> llbuilder =  
> "llvm_builder_before"
> +val builder_before : llvalue -> llbuilder
> +
> +(** [builder_after ins] creates an instruction builder positioned  
> after the
> +    instruction [isn]. See the constructor for [llvm::LLVMBuilder].  
> *)
> +val builder_after: llvalue -> llbuilder
> +
> +(** [builder_at_start bb] creates an instruction builder positioned  
> at the start
> +    of the basic block [bb]. See the constructor for  
> [llvm::LLVMBuilder]. *)
> +val builder_at_start: llbasicblock -> llbuilder
>
>  (** [builder_at_end bb] creates an instruction builder positioned  
> at the end of
>      the basic block [bb]. See the constructor for  
> [llvm::LLVMBuilder]. *)
> -external builder_at_end : llbasicblock -> llbuilder =  
> "llvm_builder_at_end"
> +val builder_at_end : llbasicblock -> llbuilder
>
>  (** [position_before ins b] moves the instruction builder [b] to  
> before the
>      instruction [isn]. See the method  
> [llvm::LLVMBuilder::SetInsertPoint]. *)
>  external position_before : llvalue -> llbuilder -> unit =  
> "llvm_position_before"
>
> +(** [position_after ins b] moves the instruction builder [b] to  
> after the
> +    instruction [isn]. See the method  
> [llvm::LLVMBuilder::SetInsertPoint]. *)
> +val position_after : llvalue -> llbuilder -> unit
> +
> +(** [position_at_start bb b] moves the instruction builder [b] to  
> the start of
> +    the of basic block [bb]. See the method
> +    [llvm::LLVMBuilder::SetInsertPoint]. *)
> +val position_at_start : llbasicblock -> llbuilder -> unit
> +
>  (** [position_at_end bb b] moves the instruction builder [b] to the  
> end of the
>      basic block [bb]. See the method  
> [llvm::LLVMBuilder::SetInsertPoint]. *)
>  external position_at_end : llbasicblock -> llbuilder -> unit
> diff --git bindings/ocaml/llvm/llvm_ocaml.c bindings/ocaml/llvm/ 
> llvm_ocaml.c
> index c966091..eb9bda3 100644
> --- bindings/ocaml/llvm/llvm_ocaml.c
> +++ bindings/ocaml/llvm/llvm_ocaml.c
> @@ -660,6 +660,44 @@ CAMLprim value llvm_value_is_block(LLVMValueRef  
> Val) {
>    return Val_bool(LLVMValueIsBasicBlock(Val));
>  }
>
> +/*--... Operations on  
> instructions .........................................--*/
> +
> +/* llbasicblock -> llvalue option */
> +CAMLprim value llvm_first_instr(LLVMBasicBlockRef BB) {
> +  CAMLparam0();
> +  LLVMValueRef Instr;
> +  if ((Instr = LLVMGetFirstInstruction(BB))) {
> +    value Option = alloc(1, 0);
> +    Field(Option, 0) = (value) Instr;
> +    CAMLreturn(Option);
> +  }
> +  CAMLreturn(Val_int(0));
> +}
> +
> +/* llbasicblock -> llvalue option */
> +CAMLprim value llvm_next_instr(LLVMValueRef I) {
> +  CAMLparam0();
> +  LLVMValueRef Instr;
> +  if ((Instr = LLVMGetNextInstruction(I))) {
> +    value Option = alloc(1, 0);
> +    Field(Option, 0) = (value) Instr;
> +    CAMLreturn(Option);
> +  }
> +  CAMLreturn(Val_int(0));
> +}
> +
> +/* llbasicblock -> llvalue option */
> +CAMLprim value llvm_prev_instr(LLVMValueRef I) {
> +  CAMLparam0();
> +  LLVMValueRef Instr;
> +  if ((Instr = LLVMGetPreviousInstruction(I))) {
> +    value Option = alloc(1, 0);
> +    Field(Option, 0) = (value) Instr;
> +    CAMLreturn(Option);
> +  }
> +  CAMLreturn(Val_int(0));
> +}
> +

Looks good.

>  /*--... Operations on call  
> sites ...........................................--*/
>
>  /* llvalue -> int */
> @@ -735,20 +773,6 @@ CAMLprim value llvm_builder(value Unit) {
>    return alloc_builder(LLVMCreateBuilder());
>  }
>
> -/* llvalue -> llbuilder */
> -CAMLprim value llvm_builder_before(LLVMValueRef Inst) {
> -  LLVMBuilderRef B = LLVMCreateBuilder();
> -  LLVMPositionBuilderBefore(B, Inst);
> -  return alloc_builder(B);
> -}
> -
> -/* llbasicblock -> llbuilder */
> -CAMLprim value llvm_builder_at_end(LLVMBasicBlockRef BB) {
> -  LLVMBuilderRef B = LLVMCreateBuilder();
> -  LLVMPositionBuilderAtEnd(B, BB);
> -  return alloc_builder(B);
> -}
> -

Nice.

>  /* llvalue -> llbuilder -> unit */
>  CAMLprim value llvm_position_before(LLVMValueRef Inst, value B) {
>    LLVMPositionBuilderBefore(Builder_val(B), Inst);
>

Thanks Erick!

— Gordon

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20080320/d9ec70a8/attachment.html>


More information about the llvm-commits mailing list