[PATCH] D136400: [llvm-ocaml] Migrate from naked pointers to prepare for OCaml 5

Dmitrii Kosarev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Feb 18 11:35:04 PST 2023


Kakadu added a comment.

It looks like most of my comments were false alarms. Although, two places are left, which bother me.



================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:45
+                       "treats them as integers");
+  return (void *)(v ^ 1);
 }
----------------
alan wrote:
> Kakadu wrote:
> > Could you add a comment why these two functions are needed? I always though that out-of-heap values should be wrapped into Abstract_tag. At least the manual says that https://v2.ocaml.org/manual/intfc.html#ss:c-outside-head
> > 
> > Sorry for asking stupid questions, I was teleported there from https://discuss.ocaml.org/t/proposal-care-more-about-ocaml-bindings-for-popular-libraries/11451
> When I first wrote this patch, my original approach was to wrap LLVM values in an OCaml block of Abstract_tag. In a discussion with @jberdine, he said that I could assume that all pointers from LLVM are 2-byte aligned, and set the low bit to 1 when exposing them to OCaml (https://github.com/llvm/llvm-project/issues/58134#issuecomment-1271544642). The OCaml GC distinguishes pointers from integers from the low bit; the low bit of integers is 1. Therefore, the OCaml GC will treat out-of-OCaml-heap LLVM pointers as integers.
> 
> The OCaml documentation that you link also recommends this representation:
> 
> > For pointers that are at least 2-aligned (the low bit is guaranteed to be zero), we have yet another valid representation as an OCaml tagged integer. 
> 
> This approach avoids an extra allocation. This design choice is documented in the comments and assert messages. Is there anything else I need to add to make it clear for people unfamiliar with the code?
Sorry, I forgot that there is an optimization to avoid Abstract_tag.  It would be great to have a comment explicitly saying that. 


================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:61
 
 value cstr_to_string(const char *Str, mlsize_t Len) {
   if (!Str)
----------------
alan wrote:
> Kakadu wrote:
> > Am I understanding right... LLVM uses non-zero-terminated strings under the hood, so this function is need?
> > 
> > In case of zero terminated string it would be better to use caml_copy_string https://github.com/ocaml/ocaml/blob/5.0/runtime/alloc.c#L208
> This code is what exists on the main branch and my patch does not touch it. The code passes the tests; changing this function is not the business of my patch.
> 
> It seems that this function accepts NULL pointers and handles the NULL case by allocating a zero-length OCaml-heap-allocated string.
Yeah, you are right. The caml_alloc_string+memcpy could be probably be replaced by caml_copy_string, but it could be a topic for a distinct patch....


================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:663
+  unsigned Length = LLVMCountStructElementTypes(Type_val(StructTy));
+  LLVMTypeRef *Temp = alloc_temp(Length);
+  LLVMGetStructElementTypes(Type_val(StructTy), Temp);
----------------
alan wrote:
> Kakadu wrote:
> > Could you double check that there is no bug here? See my comment about alloc_temp. 
> This code allocates a temporary, intermediate array as a buffer to store the output of `LLVMGetStructElementTypes`. Then, it copies them over to an OCaml-heap-allocated array (which internally have the same representation as OCaml tuples) and copies over the elements, wrapping them by setting the low bit to 1.
Yes, but it looks like alloc_temp accepts array as an argument, but you pass `unsigned Length` here...


================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:1405
+  CAMLreturn(to_val(Value));
+}
+
----------------
alan wrote:
> Kakadu wrote:
> > I'm wondering... Is it a good idea to define a macro for more short definitions of binary operations above?
> For this patch, I just did things by-hand. I think this is simpler than a macro for now, as a lot of this code wraps functions by hand. If you really want a macro, I can change it to one.
Nonono, this patch is OK without a macro.


================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:3126
   for (Tl = Incoming; Tl != Val_int(0); Tl = Field(Tl, 1)) {
     value Hd = Field(Tl, 0);
+    LLVMValueRef V = Value_val(Field(Hd, 0));
----------------
alan wrote:
> Kakadu wrote:
> > It looks like the 'value' is not required.
> By putting `value`, I define a new `Hd` local variable that shadows the outer one. This code is fine, but reusing the outer variable will also probably be fine. It just depends on your style preference.
Your explanation is fine, if we talk about C++, but OCaml manual explicitly says 
"Rule 2  Local variables of type value must be declared with one of the CAMLlocal macros."


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136400/new/

https://reviews.llvm.org/D136400



More information about the llvm-commits mailing list