[PATCH] D60242: Add IR support, ELF section and user documentation for partitioning feature.

Thu Apr 4 15:39:37 PDT 2019

pcc added inline comments.

================
Comment at: lld/docs/Partitions.rst:10
+
+LLD's partitioning feature allows a program to be split into multiple pieces,
+or partitions. A partitioned program consists of a main partition together
----------------
peter.smith wrote:
> Program is an interesting choice of word here. From the initial RFC I think this is largely aimed at executables. I don't think that it would be impossible to make this work with shared-libraries, however I'd expect the constraints would make it difficult to use effectively. I'm thinking of an executable calling a function in the shared library that happens to be in a partition not loaded.
> 
> Maybe replace program with executable if that is the only option.
It can be used with both executables and shared libraries. Indeed, the native parts of Android apps are shared libraries, so the shared library case is important to me. I'll clarify this in the doc.

In your executable calling a shared library scenario, the executable is linked separately from the shared library, and depends on it in the usual way (either DT_NEEDED or dlopen), correct? In this case, I don't think it would be possible for such a call to happen. To support the intended usage model of looking up partition entry points by name, the dynamic symbol table of each partition (including the main partition) defines just the symbols for that partition.

One thing that could cause problems is if the executable had a DT_NEEDED dependency on a loadable partition, as this may result in the main partition being loaded at the wrong relative address. In this case, I think it should be possible for a dynamic loader to make this work by using something like the dynamic tag that I mentioned in my initial proposal.

================
Comment at: lld/docs/Partitions.rst:15
+but unlike a shared object, the main partition and the loadable partitions
+share a virtual address space, and each loadable partition is assigned a
+fixed load address relative to the load address of the main partition. This
----------------
ruiu wrote:
> share a virtual address space at link time?
Yes, that might be the best way to phrase it. The partitions share a virtual address space in the p_vaddr sense, but that could probably be confused with the runtime virtual address space which is shared by everything.

================
Comment at: lld/docs/Partitions.rst:27-31
+could be a group of functions that expose the functionality implemented by
+the partition. The intent is that in order to use a loadable partition, the
+program will use ``dlopen``/``dlsym`` or similar functions to dynamically
+load the partition at its assigned address, look up an entry point by name
+and call it. Note, however, that the standard ``dlopen`` function does not
----------------
ruiu wrote:
> Each partition's entry point address is fixed at link-time, so I wonder why you chose to get the address of the entry point by name. If you use the entry point by address, you can possibly remove the symbol table, no?
A couple of reasons:
- This forces users to explicitly load the partition in order to access the entry point. If the entry point addresses were available statically, it would be easier to accidentally refer to a partition's entry point before it is loaded.
- In some cases, it's desirable for the entry points to be findable by name in the dynamic symbol table. For example, this is how JNI finds implementations of native methods; see: https://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/design.html#wp615 which could be very useful for Android apps.

================
Comment at: lld/docs/Partitions.rst:65
+
+  $ clang -ffunction-sections -fdata-sections -c main.c
+  $ clang -ffunction-sections -fdata-sections -fsymbol-partition=libfeature.so -c feature.c
----------------
peter.smith wrote:
> If I've understood it correctly you have to know ahead of time which source files are in which partition, the linker won't automatically create these for you. May be worth putting a shell comment before the main.c and before the feature.c emphasising that these are likely to be two separate build steps. I'm thinking that a common tool flow would be to build each partition into a separate static archive.
Done. Note that it might not work for them to be archives because that gives no guarantee that the archive members will be added to the link, especially not if there won't be direct references. (The same thing applies to anything else that might need to be exported from a shared object.) In Chromium we're going to use a source_set [1] for the partition entry points.

[1] https://gn.googlesource.com/gn/+/refs/heads/master/docs/reference.md#func_source_set

================
Comment at: lld/docs/Partitions.rst:67
+  $ clang -ffunction-sections -fdata-sections -fsymbol-partition=libfeature.so -c feature.c
+  $ clang main.o feature.o -fuse-ld=lld -shared -o libcombined.so -Wl,-soname,libmain.so -Wl,--gc-sections
+  $ llvm-objcopy libcombined.so libmain.so --extract-main-partition
----------------
peter.smith wrote:
> Just a thought, is --gc-sections compulsory? I don't think that it is but if you are hanging this off the --gc-sections it might be easier to automatically enable it if LLD encounters a partition. Or maybe have a --combined-object option that forces --gc-sections, and if not present drops all the partition information and produces a regular output?
> Just a thought, is --gc-sections compulsory?

It technically isn't, but the behaviour without it isn't all that useful. With my current implementation, what I think will happen is that the linker will put almost everything in the main partition, and the loadable partitions will just contain symbol tables (whose symbols will point into the main partition). So it might indeed be a good idea to enable --gc-sections if partitions are encountered.

> Or maybe have a --combined-object option that forces --gc-sections, and if not present drops all the partition information and produces a regular output?

We might want an option for controlling whether to create a combined output file but I don't think we should have that to start with and I don't think we should be dropping the partition specifications by default. Similar to my point in http://lists.llvm.org/pipermail/llvm-dev/2019-March/131015.html , leaving them in would allow downstream tools to detect the case where partition specifications were present but not acted upon.

================
Comment at: lld/docs/Partitions.rst:70
+  $ llvm-objcopy libcombined.so libfeature.so --extract-partition=libfeature.so
+
+In order to allow a program to discover the names of its loadable partitions
----------------
peter.smith wrote:
> This is an llvm-objcopy thought, perhaps a --extract-partition=all option would be useful. In this case the name of the partition would be the name of the filename as in your example above.
It might occasionally be useful (though I'd probably name it --extract-all-partitions to avoid the (admittedly corner case) ambiguity between extracting all partitions and the partition named "all"), but since in most cases, the call to llvm-objcopy will be integrated into a build system which will know exactly which partitions are being created (and should be able to schedule the llvm-objcopy commands in parallel), combined with the fact that this feature can be relatively easily implemented in a shell script, it doesn't seem that important to me.

================
Comment at: lld/docs/Partitions.rst:83
+
+The ``name_relptr`` field is a relative pointer to a null-terminated string
+containing the soname of the partition, the ``addr_relptr`` field is a
----------------
peter.smith wrote:
> Could offset be used instead of relptr?
Yes, that works too.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60242/new/

https://reviews.llvm.org/D60242