[PATCH] D136400: [llvm-ocaml] Migrate from naked pointers to prepare for OCaml 5

Josh Berdine via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 24 09:22:50 PST 2023


jberdine added a comment.

Thanks for persisting with this and tolerating all my comments! This looks like it is getting very close, nice! I'm sorry for the piecemeal reviewing, but making a full pass takes longer than I can usually spend at once and I figure it is better to send comments intermittently rather than save them until the end. So far I have spent most time looking at llvm_ocaml.{c,ml,mli} and not so much on the other files, but I think that llvm_ocaml is just about there.

For info I found https://github.com/ocaml/ocaml/blob/f1550145d4ebe8ee22853d13f97fca288aaef51f/runtime/memory.c#L203 which notes that normal assignments are ok to initialize freshly allocated blocks with value that are not pointers to young blocks.



================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:46
 
+value to_val(void *ptr) {
+  assert((((value)ptr) & 1) == 0 &&
----------------
It might not hurt to add a comment before `to_val` and `from_val` saying that they encode pointers to LLVM objects as OCaml tagged integers, following the 3rd encoding suggested by the OCaml FFI documentation at https://ocaml.org/releases/5.0/htmlman/intfc.html#ss:c-outside-head . I don't know how stable such urls are expected to be though.


================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:55
+                       "low bit set so that the OCaml GC "
+                       "treats them as integers");
+  return (void *)(v ^ 1);
----------------



================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:128
 
-static value alloc_variant(int tag, void *Value) {
+void *alloc_temp(value Elements) {
+  unsigned Length = Wosize_val(Elements);
----------------
This function could probably use a comment explaining its usage since its functionality is quite specific and has a general sounding name. I don't feel strongly, but there might be a better name possible. It is basically `from_val` lifted to operate on arrays, so perhaps `from_val_array`.


================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:129
+void *alloc_temp(value Elements) {
+  unsigned Length = Wosize_val(Elements);
+  void **Temp = malloc(sizeof(void *) * Length);
----------------
Note that `Wosize_val` returns type `mlsize_t`, which is defined in a platform-specific way because it needs to be different on 32 vs 64 bit systems (it will be `uint64_t` on 64-bit systems, so usually wider than `unsigned`). It would probably be better to use `mlsize_t` instead of `unsigned` where the values come from OCaml and keep `unsigned` where the values come from LLVM. In cases where both are involved, I think but am not sure that it would generally be clearer to have the local variables with the unconverted types, and leave the casts between `unsigned` and `mlsize_t` implicit when passing arguments to functions.


================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:134
+  for (unsigned I = 0; I < Length; ++I) {
+    Temp[I] = from_val(Field(Elements, I));
+  }
----------------
Just a note to ourselves for maybe later: If we want to make the bindings safe for concurrent clients, we may want something stronger than a non-atomic store here. (The load from `Elements` is `volatile`.)


================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:370-374
+/* llmodule -> string -> unit */
+value llvm_set_module_inline_asm(value M, value Asm) {
+  LLVMSetModuleInlineAsm(Module_val(M), String_val(Asm));
+  return Val_unit;
+}
----------------
No big deal, but I don't know why this function was moved up a few lines


================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:514
+value llvm_function_type(value RetTy, value ParamTys) {
+  CAMLparam1(ParamTys);
+  size_t len = Wosize_val(ParamTys);
----------------
I think this CAMLparam is not needed since alloc_temp allocates off the OCaml heap. And if this one is needed, then RetTy should also be a root.


================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:524
+value llvm_var_arg_function_type(value RetTy, value ParamTys) {
+  CAMLparam1(ParamTys);
+  size_t len = Wosize_val(ParamTys);
----------------
Same as function above


================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:558
+value llvm_struct_type(value C, value ElementTypes) {
+  CAMLparam1(ElementTypes);
+  size_t Length = Wosize_val(ElementTypes);
----------------
here too


================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:568
+value llvm_packed_struct_type(value C, value ElementTypes) {
+  CAMLparam1(ElementTypes);
+  size_t Length = Wosize_val(ElementTypes);
----------------
and here


================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:583
+value llvm_struct_set_body(value Ty, value ElementTypes, value Packed) {
+  CAMLparam1(ElementTypes);
+  unsigned Length = Wosize_val(ElementTypes);
----------------
here


================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:584
+  CAMLparam1(ElementTypes);
+  unsigned Length = Wosize_val(ElementTypes);
+  LLVMTypeRef *Temp = alloc_temp(ElementTypes);
----------------
Related to the comment on `alloc_temp` about `unsigned`, I think it would be clearer if this `unsigned` was instead `mlsize_t`, leaving the case to `unsigned` implicit in the call to `LLVMStructSetBody` below.

There are several other similar uses, I would look around each call to `alloc_temp` and at each call to `Wosize_val`.


================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:1808
+
 /*--... Operations on basic blocks .........................................--*/
 
----------------
Unless reordering the next several definitions is needed, it would be easier to review if they were left in the same order. They can be reordered if desired in a separate diff that only reorders things.


================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:1955-1967
+value llvm_call_site_attrs(value F, value Index) {
+  CAMLparam0();
+  CAMLlocal2(Temp, Array);
+  unsigned Length = LLVMGetCallSiteAttributeCount(Value_val(F), Int_val(Index));
+  Temp = caml_alloc(1, Abstract_tag);
+  Array = caml_alloc_tuple_uninit(Length);
+  LLVMGetCallSiteAttributes(Value_val(F), Int_val(Index),
----------------
Could you double-check this one. `Temp` is allocated on the caml heap unlike other functions, and as written `F` and `Index` will need to be registered as roots.


================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:2180
+value llvm_insertion_block(value B) {
+  CAMLparam1(B);
   LLVMBasicBlockRef InsertBlock = LLVMGetInsertBlock(Builder_val(B));
----------------
jberdine wrote:
> Left over?
Still left over I think


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136400/new/

https://reviews.llvm.org/D136400



More information about the llvm-commits mailing list