[llvm] [Transforms][IPO] Add func suffix in ArgumentPromotion and DeadArgume… (PR #105742)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 29 11:40:36 PDT 2024
yonghong-song wrote:
> > > If the name is using C++ mangling, does the demangler produce something reasonable? If not, can we modify the name in a demangler-friendly way?
> > I am not sure about this as my context is linux kernel which is C based. How do I know demangler produce something reasonable?
>
> Take a small C++ function that gets optimized, run the resulting symbol through c++filt, see what you get.
Instead of writing a small C++ function to trigger the desired optimization, I built clang itself twice: once with gcc (11.4.1) and once with clang carrying the func-suffix change from this pull request. Below are two example symbols, one generated by gcc and one generated by clang with the added func suffix:
```
$ cat t
_ZN5clang19RecursiveASTVisitorIN12_GLOBAL__N_119PluralMisuseChecker13MethodCrawlerEE14TraverseIfStmtEPNS_6IfStmtEPN4llvm15SmallVectorImplINS7_14PointerIntPairIPNS_4StmtELj1EbNS7_21PointerLikeTypeTraitsISB_EENS7_18PointerIntPairInfoISB_Lj1ESD_EEEEEE.part.0.constprop.0.isra.0
_ZZNK4llvm20AMDGPUTargetLowering29isDesirableToCommuteWithShiftEPKNS_6SDNodeENS_12CombineLevelEENK3$_0clENS_7SDValueES6_.argprom.argelim
$ c++filt < t
clang::RecursiveASTVisitor<(anonymous namespace)::PluralMisuseChecker::MethodCrawler>::TraverseIfStmt(clang::IfStmt*, llvm::SmallVectorImpl<llvm::PointerIntPair<clang::Stmt*, 1u, bool, llvm::PointerLikeTypeTraits<clang::Stmt*>, llvm::PointerIntPairInfo<clang::Stmt*, 1u, llvm::PointerLikeTypeTraits<clang::Stmt*> > > >*) [clone .part.0] [clone .constprop.0] [clone .isra.0]
llvm::AMDGPUTargetLowering::isDesirableToCommuteWithShift(llvm::SDNode const*, llvm::CombineLevel) const::$_0::operator()(llvm::SDValue, llvm::SDValue) const [clone .argprom] [clone .argelim]
```
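So I think we should be okay for C++ symbols: c++filt demangles the base name and reports each suffix as a [clone .<suffix>] annotation. The suffixes themselves also seem reasonable: each one is short, yet it indicates which transformation was applied.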
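To make the c++filt behavior above concrete, here is a minimal C sketch of where the pass-added suffix begins. It assumes the first '.' in the symbol marks the end of the original name, which holds for C identifiers and for Itanium-mangled names (where '.' introduces a vendor-specific suffix). The helper names clone_suffix and base_name_length are hypothetical and are not part of LLVM or binutils.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustrative helper (hypothetical name, not part of LLVM or binutils).
 * Returns a pointer to the first '.' in `sym`, i.e. the start of a
 * pass-added suffix such as ".argprom.argelim", or a pointer to the
 * terminating NUL when the symbol carries no suffix.
 */
static const char *clone_suffix(const char *sym) {
    assert(sym != NULL);
    const char *dot = strchr(sym, '.');
    return dot != NULL ? dot : sym + strlen(sym);
}

/* Length of the part before the suffix: the name a demangler would decode. */
static size_t base_name_length(const char *sym) {
    return (size_t)(clone_suffix(sym) - sym);
}
```

For "f.isra.0" this yields a base name of length 1 ("f") and the suffix ".isra.0". A symbol without any '.' is returned with an empty suffix.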
>
> > > If you try to set a breakpoint on a function, how will a debugger react to the changed name?
> > Again, good question. But gcc already have functions like .constprop., .isra.. Do gdb/lldb handle those functions?
>
> I think maybe we end up doing a wildcard search that finds them? Not sure.
I tried an example in gdb where the symbol has a gcc suffix.
```
[~/tmp5]$ cat main.c
extern int g(int);
int main(void) {
  return g(5);
}
[~/tmp5]$ vi test.c
[~/tmp5]$ cat test.c
__attribute__((noinline)) static int f(int a, int* b) { return a + *b; }
int g(int x) { return f(x + 1, &x); }
$ gcc -O2 -g test.c main.c && llvm-readelf -s a.out | grep f | grep isra
47: 0000000000401120 4 FUNC LOCAL DEFAULT 12 f.isra.0
```
With gdb,
```
(gdb) break f.isra.0
Breakpoint 1 at 0x401120: file test.c, line 1.
(gdb) run
Starting program: /home/yhs/tmp5/a.out
Breakpoint 1, f (a=6, b=<optimized out>, b=<optimized out>) at test.c:1
1 __attribute__((noinline)) static int f(int a, int* b) { return a + *b; }
(gdb) disassemble
Dump of assembler code for function f:
=> 0x0000000000401120 <+0>: lea (%rdi,%rsi,1),%eax
0x0000000000401123 <+3>: retq
End of assembler dump.
(gdb) s
0x00007ffff7c29590 in __libc_start_call_main () from /lib64/libc.so.6
(gdb) s
Single stepping until exit from function __libc_start_call_main,
which has no line number information.
[Inferior 1 (process 4133944) exited with code 013]
(gdb)
The program is not being run.
(gdb) del 1
(gdb) break f
Breakpoint 2 at 0x401120: file test.c, line 1.
(gdb) run
Starting program: /home/yhs/tmp5/a.out
Breakpoint 2, f (a=6, b=<optimized out>, b=<optimized out>) at test.c:1
1 __attribute__((noinline)) static int f(int a, int* b) { return a + *b; }
(gdb) disassemble
Dump of assembler code for function f:
=> 0x0000000000401120 <+0>: lea (%rdi,%rsi,1),%eax
0x0000000000401123 <+3>: retq
End of assembler dump.
(gdb) s
0x00007ffff7c29590 in __libc_start_call_main () from /lib64/libc.so.6
(gdb) s
Single stepping until exit from function __libc_start_call_main,
which has no line number information.
[Inferior 1 (process 4136424) exited with code 013]
```
So in gdb, you can give either the final symbol name from the symbol table or the original source-level name; in the latter case gdb seems to find the function despite the '.<...>' suffix, as you suggested.
lldb seems to be able to do the same thing:
```
(lldb) b f
Breakpoint 1: where = a.out`f at test.c:1:66, address = 0x0000000000401120
(lldb) b f.isra.0
Breakpoint 2: where = a.out`f at test.c:1:66, address = 0x0000000000401120
```
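The lookup behavior observed in the gdb and lldb sessions above can be approximated by the following C sketch. This is only an illustration of the observable result; it is not gdb's or lldb's actual lookup code, and the debuggers may well resolve the plain name through debug info rather than by stripping suffixes. The function name symbol_matches_query is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical matcher, for illustration only; this is not gdb's or lldb's
 * actual lookup code. A query matches a symbol-table entry when it equals
 * the whole entry ("f.isra.0") or the part before the first '.' ("f").
 */
static bool symbol_matches_query(const char *symbol, const char *query) {
    assert(symbol != NULL && query != NULL);
    if (strcmp(symbol, query) == 0)
        return true;                    /* exact symbol-table name */
    const char *dot = strchr(symbol, '.');
    if (dot == NULL)
        return false;                   /* no suffix to strip */
    size_t base_len = (size_t)(dot - symbol);
    return strlen(query) == base_len && strncmp(symbol, query, base_len) == 0;
}
```

With this sketch, both "f" and "f.isra.0" match the symbol-table entry f.isra.0, while a partial suffix such as "f.isra" does not.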
So I think the newly added suffixes should work with gdb/lldb as well. Note that llvm already has some suffixes, like
<...>.llvm.<hash> for thinlto, <...>.<n> for fulllto, <...>.specialized.<n>, etc.
https://github.com/llvm/llvm-project/pull/105742