[PATCH] D144596: Add extra parameter to thinlto-prefix-replace

Teresa Johnson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 14 16:49:19 PDT 2023


tejohnson added a comment.

In D144596#4194714 <https://reviews.llvm.org/D144596#4194714>, @itf wrote:

> In D144596#4186838 <https://reviews.llvm.org/D144596#4186838>, @tejohnson wrote:
>
>> In D144596#4182584 <https://reviews.llvm.org/D144596#4182584>, @MaskRay wrote:
>>
>>> To support tree artifacts with an unknown number of full bitcode files and minimized bitcode files, we need to use different directories.
>>> If these bitcode files are within the same directory, Bazel will have to ship both full and minimized bitcode files to a backend compile action, causing waste.
>>> To ship just full bitcode files in a backend compile action, full and minimized bitcode files need to be in different directories (e.g. `a.bc` and `indexing/a.bc`).
>>
>> I'm very confused as this is not mentioned anywhere in this patch by @itf. Normally, at least without tree artifacts, we only ship the specified input files, not the entire directory. The stated motivation for this particular patch, which was unrelated to the input files of the indexing action, was:
>>
>> ---
>>
>>> Bazel tree artifacts require that the folders are "write once". Therefore we cannot insert the link objects in the same folder as the result from the thinlto indexing step. Currently the file "--thinlto-index-only=file" requires the link objects to be in the same folder as the summary+import files from the indexing step.
>>>
>>> This change allows having a different prefix for the link objects, and therefore keep the folders containing the summary+import files be "write once".
>>
>> ---
>>
>> @itf can you clarify whether the input object handling is different than for non-tree artifacts as well?
>
> When we have an action in bazel that may take multiple tree artifacts as an input, such as the indexing or final linking steps, the action takes as input all the files inside the directory of the tree artifact or none of them. Therefore, in order to have only the thin bitcode be an input tothe indexing step, but not the full bit code, we will need to store the full bitcode and the thin one in different places.

Ok, thanks for confirming.

> This was not originally part of this proposal because I intended to submit a second patch with a new argument such as "--thinlto-object-prefix-replace". However, after discussing,  @MaskRay suggested to simply add a 4th string to `--thinlto-prefix-replace=`, such as  `--thinlto-prefix-replace='thin/;index/;obj/;indexing/'` instead of adding a brand new argument, and I agreed that this was a cleaner solution.
>
> This 4th string can be added to this current patch or to a following patch. I can see arguments for either: because we are modifying the same argument, it should be in this patch. But because they fix different issues, they should be in different patches.

I'd prefer to do that in a separate patch and keep this one about the native object output directory.

We'll also want to deprecate --thinlto-object-suffix-replace (I'm not sure what the current deprecation policy is, @MaskRay may know).

>> Regardless, I would prefer to keep this patch about the original issue of the write-once output folders. Additional refinements to the input files can be discussed in follow on patches.
>>
>>> While we're here, let's make two more changes. First, let's use `.bc` instead of `.o` for the bitcode file extension name.
>>
>> I'm not in favor of hardcoding the extension in the compiler, it should be left to the build system, but I don't believe that what you are proposing would require a specific extension anyway. Currently, the object output file extension of the compile action is not a hardcoded requirement and I don't think it should be. Additionally, when we were initially implementing ThinLTO I recall that the Bazel team specifically wanted to keep the .o extension (otherwise the handling gets a bit trickier for them as a compile action will have different extensions depending on the optimization options). But like I said it doesn't appear to me that what you are proposing would require this.
>>
>>> Second, let's place the initially compiled bitcode files in `thin/` instead of the current working directory for clarity.
>>
>> Again, this should be decided in the build system, not by the compiler. But it also isn't clear to me that your proposal requires a specific name, just that your proposal is that we require the full and minimized bitcode to be in separate directories and supply the mapping between those instead of the mapping between the extensions. As mentioned above, I would like to understand from @itf if that is needed for tree artifacts, and discuss this in a separate patch, as it seems more like a refinement and less like a requirement, and is orthogonal to this particular change. It may also require additional changes to the build system.
>>
>> In D144596#4182628 <https://reviews.llvm.org/D144596#4182628>, @MaskRay wrote:
>>
>>> In D144596#4178243 <https://reviews.llvm.org/D144596#4178243>, @tejohnson wrote:
>>>
>>>> In D144596#4176555 <https://reviews.llvm.org/D144596#4176555>, @MaskRay wrote:
>>>>
>>>>> I find the summary difficult to follow and I believe users will be more confusing than me (as a ThinLTO contributor).
>>>>
>>>> I'll take a stab at suggesting a clearer summary in a little bit.
>>>
>>> Thanks!
>>
>> Got busy and forgot to do so. I suggest something like the following:
>>
>> ---
>>
>> Title: [lld] Support separate native object file path in --thinlto-prefix-replace
>>
>> Summary:
>> Currently, the `--thinlto-prefix-replace="oldpath;newpath"` option is used during
>> distributed ThinLTO thin links to specify the mapping of the input bitcode object
>> files' directory tree (oldpath) to the directory tree (newpath) used for both:
>>
>>   1) the output files of the thin link itself (the .thinlto.bc index files and the
>>   optional .imports files)
>>   2) the specified object file paths written to the response file given in the
>>   --thinlto-index-only=${response} option, which is used by the final native
>>   link and must match the paths of the native object files that will be
>>   produced by ThinLTO backend compiles.
>>
>> This patch expands the `--thinlto-prefix-replace` option to allow a separate directory
>> tree mapping to be specified for the object file paths written to the response file
>> (number 2 above). This is important to support builds and build systems where the
>> same output directory may not be written by multiple build actions (e.g. the thin link
>> and the ThinLTO backend compiles).
>>
>> The new format is: `--thinlto-prefix-replace="origpath;outpath[;objpath]"`
>>
>> This replaces the `origpath` directory tree of the thin link input files with
>> `outpath` when writing the thin link index and imports outputs (number 1
>> above). If `objpath` is specified it replaces `origpath` of the input files with
>> `objpath` when writing the response file (number 2 above), otherwise it
>> falls back to the old behavior of using `outpath` for this as well.
>>
>> ---
>
> In the end, the new format should be:
>
> ` --thinlto-prefix-replace="origpath;outpath[;objpath][;indexpath]"` with 4 arguments, in some order.  the new `objpath` allows us to have each directory be the output of a different action, and `;indexpath`  allows us to have  each directory be either the input of an action or not, without a subset of a directory being the input of an action.
>
> In bazel, this is not an issue when handling non-tree artifacts, because for normal files we can decide to include or not include the file as input to an action.  For tree artifacts, when we add a tree artifact as an input to an action, the whole directory gets added.




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144596/new/

https://reviews.llvm.org/D144596



More information about the llvm-commits mailing list