[PATCH] D138560: [lld][LTO] Add assembly output to LTO save-temps

Pierre van Houtryve via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 30 00:46:14 PST 2022


Pierre-vh added a comment.

In D138560#3958723 <https://reviews.llvm.org/D138560#3958723>, @MaskRay wrote:

> In D138560#3952769 <https://reviews.llvm.org/D138560#3952769>, @Pierre-vh wrote:
>
>> In D138560#3951995 <https://reviews.llvm.org/D138560#3951995>, @MaskRay wrote:
>>
>>> In D138560#3951169 <https://reviews.llvm.org/D138560#3951169>, @yaxunl wrote:
>>>
>>>> In D138560#3950141 <https://reviews.llvm.org/D138560#3950141>, @MaskRay wrote:
>>>>
>>>>> I am unsure I am convinced with the motivation. Say you have two bitcode files a.o and b.o, can you just use `clang++ -fuse-ld=lld a.o b.o -Wl,--lto-emit-asm`? This can be used with `-Wl,--save-temps` as well. The output is named in term of the `-o` output file name, instead of `*.s`, but the slight difference doesn't warrant adding significant more complexity to LTOBackend.cpp as this patch does.
>>>>
>>>> We use lld to do device LTO in the HIP toolchain. clang driver extracts device bitcode and passes them to lld. When -save-temps is used, clang driver adds -save-temps to the options passed to lld. It is not convenient for the users to rerun lld command to get the assembly. It is better for lld to generate the assembly when -save-temps is used because: 1. it avoids redundant work, especially device LTO is usually the most time-consuming part 2. to keep the temporary file names consistent 3. it is a useful feature for lld itself
>>>
>>> `clang --save-temps` does not pass `--save-temps` to the linker. The user can use `-Wl,--save-temps` which is more orthogonal. I can use `-Wl,--lto-emit-asm` today to get assembly output.
>>>
>>> In D138560#3950224 <https://reviews.llvm.org/D138560#3950224>, @Pierre-vh wrote:
>>>
>>>> In D138560#3950141 <https://reviews.llvm.org/D138560#3950141>, @MaskRay wrote:
>>>>
>>>>> I am unsure I am convinced with the motivation. Say you have two bitcode files a.o and b.o, can you just use `clang++ -fuse-ld=lld a.o b.o -Wl,--lto-emit-asm`? This can be used with `-Wl,--save-temps` as well. The output is named in term of the `-o` output file name, instead of `*.s`, but the slight difference doesn't warrant adding significant more complexity to LTOBackend.cpp as this patch does.
>>>>
>>>> With your suggestion, does it also generate the final, linked output file? Or just the .S?
>>>> The idea is to be able to get both alongside each other. We'd also like to have comments in the assembly file about things like register usage - the ones you get when you codegen to assembly, not when you disassemble.
>>>
>>> There is an empty output file to satisfy some build tools' requirement that an output is already emitted. You can find assembly files in `${output}1`. As mentioned, they are just not named `*.S`.
>>>
>>>> Though, I am surprised that you see this as adding significant complexity, I tried to not make it too intrusive. Is it the whole change that's problematic, or just some part of it?
>>>> If yes, maybe I can re-do the parts you find problematic to simplify them. The way I see it, this kind of patch should just be a small feature addition that shouldn't increase complexity too much. If you think complexity has increased significantly then I definitely failed somewhere and I can try to correct it before discarding the idea entirely.
>>>
>>> This is significant complexity because you do codegen twice `// Doing codegen twice may seem inefficient, but this is designed as a debug `, which you probably would agree as well that this is inelegant.
>>>
>>> Well, the real question is when `--lto-emit-asm` exists, why bother with another mode and duplicating the existing functionality. So far I fail to see a convincing argument that this new mode is useful.
>>
>> It does pass it for HIP code when using -fgpu-rdc, see `HIPAMD.cpp:151`. This would be the primary use case because when building the same HIP code without -fgpu-rdc, we get a .S temporary, but if we pass fgpu-rdc, that temporary is no longer there.
>
> It seems that the logic in  `HIPAMD.cpp:151` can simply be removed.
>
>>> There is an empty output file to satisfy some build tools' requirement that an output is already emitted. You can find assembly files in ${output}1. As mentioned, they are just not named *.S.
>>
>> This is not about build systems at all, it's for debuggability.  An empty file is not useful to us, we need the assembly with the comments.
>
> I think you misunderstand my comment. The assembly output of --lto-emit-asm is of course not empty. The object file output is empty which is fine for your case.
> If you want the object file output as well, re-run LTO without --lto-emit-asm. It's similar to what your patch intends to do, but places less burden to libLTO.



> It seems that the logic in HIPAMD.cpp:151 can simply be removed.

Why? How would it solve the issue?

> If you want the object file output as well, re-run LTO without --lto-emit-asm. It's similar to what your patch intends to do, but places less burden to libLTO.

Are you suggesting that we do this at the driver level instead? So create two linker jobs there if --save-temps is passed to get both a .o and .s?
Or that in this specific use case we don't make any code change and just work around the issue?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138560/new/

https://reviews.llvm.org/D138560



More information about the llvm-commits mailing list