[PATCH] D136400: [llvm-ocaml] Migrate from naked pointers to prepare for OCaml 5

Alan via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Feb 19 06:04:50 PST 2023


alan added inline comments.


================
Comment at: llvm/bindings/ocaml/llvm/llvm_ocaml.c:45
+                       "treats them as integers");
+  return (void *)(v ^ 1);
 }
----------------
jberdine wrote:
> jrtc27 wrote:
> > Kakadu wrote:
> > > alan wrote:
> > > > Kakadu wrote:
> > > > > Could you add a comment why these two functions are needed? I always though that out-of-heap values should be wrapped into Abstract_tag. At least the manual says that https://v2.ocaml.org/manual/intfc.html#ss:c-outside-head
> > > > > 
> > > > > Sorry for asking stupid questions, I was teleported there from https://discuss.ocaml.org/t/proposal-care-more-about-ocaml-bindings-for-popular-libraries/11451
> > > > When I first wrote this patch, my original approach was to wrap LLVM values in an OCaml block of Abstract_tag. In a discussion with @jberdine, he said that I could assume that all pointers from LLVM are 2-byte aligned, and set the low bit to 1 when exposing them to OCaml (https://github.com/llvm/llvm-project/issues/58134#issuecomment-1271544642). The OCaml GC distinguishes pointers from integers from the low bit; the low bit of integers is 1. Therefore, the OCaml GC will treat out-of-OCaml-heap LLVM pointers as integers.
> > > > 
> > > > The OCaml documentation that you link also recommends this representation:
> > > > 
> > > > > For pointers that are at least 2-aligned (the low bit is guaranteed to be zero), we have yet another valid representation as an OCaml tagged integer. 
> > > > 
> > > > This approach avoids an extra allocation. This design choice is documented in the comments and assert messages. Is there anything else I need to add to make it clear for people unfamiliar with the code?
> > > Sorry, I forgot that there is an optimization to avoid Abstract_tag.  It would be great to have a comment explicitly saying that. 
> > Pointers are not integers.
> > Pointers are not integers.
> 
> From the perspective of the FFI, bit patterns with the lsb set are not interpreted as pointers that the runtime system should follow. It happens that integers in OCaml are represented with bit patterns where the lsb is set, so it is common, perhaps imprecise, usage to refer to pointers where their lsb has been set as "integers".
>From the perspective of program analysis, pointers are not integers because pointers have provenance. From the perspective of the runtime system, everything is a word. To be pedantic: A word is a sequence of bits; pointers, integers, Booleans, and nullary data constructors are all possible meanings ascribed to words. OCaml's GC, being a precise GC, distinguishes between pointers, which it may follow and relocate, and non-pointers, through the low bit. Setting the low bit of a pointer originating from LLVM causes the OCaml GC to treat it as a word that the GC does not follow.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136400/new/

https://reviews.llvm.org/D136400



More information about the llvm-commits mailing list