[PATCH] D115850: [LTO][codegen] Add TargetLibraryInfoWrapperPass initially

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 23 08:49:04 PST 2021


craig.topper added a comment.

In D115850#3202076 <https://reviews.llvm.org/D115850#3202076>, @FreddyYe wrote:

> For example this patch can fix such performance drop on Linux:
>
>   $ cat foo.h
>   double foo(double a);
>   $ cat foo.c
>   #include <math.h>
>   #include "foo.h"
>   
>   double foo(double a){
>       return sqrt(a);
>   }
>   $ cat main.c
>   #include <stdio.h>
>   #include "foo.h"
>   int main() { printf("%lf\n", foo(200));}
>   $ cat foo.sh
>   clang -c -o foo.o -Ofast -flto foo.c
>   clang -c -o main.o -Ofast -flto main.c
>   clang -Ofast -flto foo.o main.o -lm
>   objdump -d a.out | less
>   $ sh foo.sh
>
> Without this patch, enabling -flto generates `call __sqrt_finite`, while disabling -flto generates `sqrtsd %xmm0,%xmm0`. The latter version is faster. I don't know how to add an LTO-dependent lit test. Hi @aeubanks, do you know how?

Why would TLI be involved in that test? Isn't the sqrt call converted to llvm.sqrt by the frontend?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115850/new/

https://reviews.llvm.org/D115850


