[llvm-commits] [PATCH] Add bindings for walking the instruction stream.

Erick Tryzelaar idadesub at users.sourceforge.net
Fri Mar 21 00:22:52 PDT 2008


On Thu, Mar 20, 2008 at 1:50 PM, Gordon Henriksen
<gordonhenriksen at mac.com> wrote:
> That's not correct; that function doesn't return an iterator, it actually
> returns the last instruction (or null if the block is empty). This is the
> tip of the tail of the doubly linked list (≠ List.tl), and is the terminator
> instruction—very important to following the CFG.

Oh, my mistake. Alright, I'll put it back in :)

> position_builder b bb (first_instr bb) and position_builder b bb (last_instr
> bb) handle empty blocks correctly. Both the *_instr functions return None,
> making position_builder use the only valid iterator (unwrap(bb)->begin() ==
> unwrap(bb)->end() by definition).

Oh sure. I just don't know which would be more likely to use,
positioning the builder where we already have the instruction, or just
using the "Some i" that we'd get from, say, next_instr?

I just grepped through all the llvm and clang code, and there were
literally 13 references to "SetInsertPoint", including the
definitions. So what does everyone do instead?

>>  (*--... Operations on instructions
>> .........................................--*)
>>  external instr_parent : llvalue -> llbasicblock =
>> "LLVMGetInstructionParent"
>> +external instr_parent : llvalue -> llbasicblock =
>> "LLVMGetInstructionParent"
>
> ?

Another typo :)

> Actually, analysis and transformation algorithms would be well-served
> initial_instr/terminator bindings which raise Not_found instead of yielding
> None. This would serve well such clients since they would not have to
> explicitly handle the case of an empty block or one with with no terminator,
> neither of which occur in valid IR.

Is that true? Because what if you're at the terminator and you take
"next_instr" on it? Another thing that may or may not be interesting
is that we could plug the "next_instr" into an ocaml stream (they've
got a constructor for unit -> 'a option), and use the camlp4 pattern
matching on it. We'd have to figure out how to pattern match
instructions, though.

> +let fold_right_block f bb init =
> +  let rec aux i init =
> +    match i with
> +    | None -> init
> +    | Some next -> f next (aux (next_instr next) init)
> +  in
> +  aux (first_instr bb) init
> +
>
> This isn't tail recursive but it could be if you bound
> LLVMGetLastInstruction and used prev_instr.

Good point. Do you think it'd be useful to also be able to iterate
from an arbitrary starting position and to iterate in reverse?

> I was trying to avoid this n x m explosion, although it is probably bounded
> at 8 = {builder,position}_{before,after,at_end,at_start}. In its place, I
> advocate the composable concepts of:
>
> IR navigation (first, next, last, prev). This is useful in other contexts.
> A pair of builder positioning functions (builder_at, position_builder) that
> are as expressive as LLVMBuilder(BasicBlock*, BasicBlock::iterator).
>
> Your first e-mail implied you were for such a concept, although I think you
> over engineered it slightly. A perfect functional concept for LLVM's ilist
> iterator would be thus:
>
> type 'a 'b llposition =
> | Before of 'a
> | At_end of 'b

I actually changed my mind against this, because first, it'll be a
little more complicated to pull the data out of a variant in C, and
second, the usage of it would be either: "position_at (Before i)" or
"position_before i" which is nearly the same. The only advantage I see
is if we were to offer more variants, but I don't think we need to.
The primitives "position_before" and "position_at_end" really
represent all possibilities, so we'd only have to bind those two. It'd
also be faster than having to box and unbox instructions. Everything
else is convenience.

Now, we could add a convenience function that was like:

type 'a 'b llposition =
  | Before of 'a
  | After of 'a
  | At_front of 'b
  | At_end of 'b

val position_builder: llbuilder -> (llvalue, llbasicblock) llposition -> unit

Which would use "position_before" and "position_at_end" to capture all
the variants. Is that what you intended?

One thing though is that this still won't let you call
"position_builder b (first_instr bb)", you'd have to unbox the option
and rebox in "position_builder b (At_front i)". Now we could make the
type:

type 'a 'b llposition =
  | Before of 'a option
  | After of 'a option
  | At_front of 'b option
  | At_end of 'b option

But then it seems to be getting a little overly complicated.

> Thanks Erick!

Thanks for the review!




More information about the llvm-commits mailing list