[llvm] [AMDGPU] Move `AMDGPUAttributorPass` to full LTO post link stage (PR #102086)

Fri Aug 9 12:04:34 PDT 2024

yxsamliu wrote:

> > Blender assumes -fno-gpu-rdc, which does not use LTO. It uses the default optimization pipeline.
> > Maybe the failure is due to removing the pass from the default optimization pipeline?
> 
> I thought we do even for no-gpu-rdc? This is from running blender test on my side:
> 
> ```
> ...
> "/llvm/bin/lld" -flavor gnu -m elf64_amdgpu --no-undefined -shared -plugin-opt=-amdgpu-internalize-symbols -plugin-opt=mcpu=gfx90a -plugin-opt=O3 --lto-CGO3 --whole-archive -o /tmp/kernel-gfx90a-824510.out /tmp/kernel-gfx90a-1b3add.o --no-whole-archive
> ...
> ```

In the -fno-gpu-rdc case, the input to this lld invocation is a relocatable object file, so no LTO is performed.
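
One way to check this on a given input is to look at the file's magic bytes: the linker only runs LTO on LLVM bitcode inputs, while an ELF relocatable object is linked as-is. Below is a minimal sketch of such a check, assuming you have the first bytes of the `.o` file at hand. The function name `classify_linker_input` and the returned labels are hypothetical, chosen for illustration. This is not how lld itself is implemented.

```python
import struct

ELF_MAGIC = b"\x7fELF"
BITCODE_MAGIC = b"BC\xc0\xde"                # raw LLVM bitcode
BITCODE_WRAPPER_MAGIC = b"\xde\xc0\x17\x0b"  # bitcode wrapper header (0x0B17C0DE, little-endian)
ET_REL = 1                                   # ELF e_type value for relocatable objects


def classify_linker_input(header: bytes) -> str:
    """Classify a linker input from its leading bytes (hypothetical helper)."""
    if header.startswith(BITCODE_MAGIC) or header.startswith(BITCODE_WRAPPER_MAGIC):
        return "bitcode"  # the kind of input that LTO operates on
    if header.startswith(ELF_MAGIC) and len(header) >= 18:
        # e_ident[EI_DATA] (offset 5) gives the byte order: 1 = little-endian.
        byte_order = "<" if header[5] == 1 else ">"
        # e_type is the 16-bit field right after the 16-byte e_ident array.
        (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
        return "elf-relocatable" if e_type == ET_REL else "elf-other"
    return "unknown"


if __name__ == "__main__":
    import sys

    # Usage: python classify.py <path-to-object-file>
    with open(sys.argv[1], "rb") as f:
        print(classify_linker_input(f.read(64)))
```

If this reports `elf-relocatable` for the `.o` passed to lld, the `-plugin-opt` options on the command line have no bitcode to act on.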

https://github.com/llvm/llvm-project/pull/102086