[PATCH] Make LLVM profiling thread-safe
Matthew Dempsky
matthew at dempsky.org
Wed Jun 26 18:28:53 PDT 2013
On Mon, Jun 24, 2013 at 03:25:14PM -0700, Matthew Dempsky wrote:
> On Mon, Jun 24, 2013 at 12:17:17PM -0700, Matthew Dempsky wrote:
> > Instead of emitting separate read-modify-write instructions to
> > increment the profiling counters, a monotonically ordered atomic add
> > instruction should be emitted.
>
> Oops, need to fix the tests too!
Ping? Any interest in this fix? (I'm not an LLVM committer, btw.)
To demonstrate this is an actual issue, I wrote this sample program:
#include <pthread.h>

void
noop(int i)
{
}

void *
worker(void *arg)
{
	int i, j;

	for (i = 0; i < 1000000; i++)
		noop(i);
	return (NULL);
}

int
main()
{
	int i;
	pthread_t thr[10];

	for (i = 0; i < 10; i++)
		if (pthread_create(&thr[i], NULL, worker, NULL))
			return (1);
	for (i = 0; i < 10; i++)
		if (pthread_join(thr[i], NULL))
			return (1);
	return (0);
}
Obviously, noop() should be entered exactly 10 million times (10
threads x 1,000,000 iterations each), but llvm-prof currently reports
only about 5.3 million:
$ clang -emit-llvm -c stress.c
$ opt -insert-edge-profiling < stress.o > stress-edge.bc
$ clang -o stress stress-edge.bc -L/usr/local/lib -lprofile_rt -lpthread
/usr/local/lib/libprofile_rt.so: warning: strcpy() is almost always misused, please use strlcpy()
/usr/local/lib/libprofile_rt.so: warning: strcat() is almost always misused, please use strlcat()
$ rm -f llvmprof.out
$ ./stress
$ llvm-prof stress-edge.bc llvmprof.out
===-------------------------------------------------------------------------===
LLVM profiling output for execution:
===-------------------------------------------------------------------------===
Function execution frequencies:
## Frequency
1. 5.3e+06/5.31039e+06 noop
2. 10/5.31039e+06 worker
3. 1/5.31039e+06 main
===-------------------------------------------------------------------------===
Top 20 most frequently executed basic blocks:
## %% Frequency
1. 25.9849% 5421453/2.08638e+07 worker() - for.inc
2. 25.4526% 5310376/2.08638e+07 noop() - entry
3. 24.4395% 5099022/2.08638e+07 worker() - for.body
4. 24.1225% 5032868/2.08638e+07 worker() - for.cond
5. 5.27228e-05% 11/2.08638e+07 main() - for.cond
6. 5.27228e-05% 11/2.08638e+07 main() - for.cond1
7. 4.79298e-05% 10/2.08638e+07 main() - for.inc10
8. 4.79298e-05% 10/2.08638e+07 main() - if.end9
9. 4.79298e-05% 10/2.08638e+07 main() - for.body3
10. 4.79298e-05% 10/2.08638e+07 main() - for.inc
11. 4.79298e-05% 10/2.08638e+07 main() - if.end
12. 4.79298e-05% 10/2.08638e+07 main() - for.body
13. 4.79298e-05% 10/2.08638e+07 worker() - for.end
14. 4.79298e-05% 10/2.08638e+07 worker() - entry
15. 4.79298e-06% 1/2.08638e+07 main() - for.end
16. 4.79298e-06% 1/2.08638e+07 main() - entry
17. 4.79298e-06% 1/2.08638e+07 main() - for.end12
18. 4.79298e-06% 1/2.08638e+07 main() - return
With my patch, this is the output I get instead:
$ clang -emit-llvm -c stress.c
$ opt -insert-edge-profiling < stress.o > stress-edge.bc
$ clang -o stress stress-edge.bc -L/usr/local/lib -lprofile_rt -lpthread
/usr/local/lib/libprofile_rt.so: warning: strcpy() is almost always misused, please use strlcpy()
/usr/local/lib/libprofile_rt.so: warning: strcat() is almost always misused, please use strlcat()
$ rm -f llvmprof.out
$ ./stress
$ llvm-prof stress-edge.bc llvmprof.out
===-------------------------------------------------------------------------===
LLVM profiling output for execution:
===-------------------------------------------------------------------------===
Function execution frequencies:
## Frequency
1. 1e+07/1e+07 noop
2. 10/1e+07 worker
3. 1/1e+07 main
===-------------------------------------------------------------------------===
Top 20 most frequently executed basic blocks:
## %% Frequency
1. 25% 10000010/4.00001e+07 worker() - for.cond
2. 24.9999% 10000000/4.00001e+07 noop() - entry
3. 24.9999% 10000000/4.00001e+07 worker() - for.body
4. 24.9999% 10000000/4.00001e+07 worker() - for.inc
5. 2.74999e-05% 11/4.00001e+07 main() - for.cond
6. 2.74999e-05% 11/4.00001e+07 main() - for.cond1
7. 2.49999e-05% 10/4.00001e+07 main() - for.inc10
8. 2.49999e-05% 10/4.00001e+07 main() - if.end9
9. 2.49999e-05% 10/4.00001e+07 main() - for.body3
10. 2.49999e-05% 10/4.00001e+07 main() - for.inc
11. 2.49999e-05% 10/4.00001e+07 main() - if.end
12. 2.49999e-05% 10/4.00001e+07 main() - for.body
13. 2.49999e-05% 10/4.00001e+07 worker() - for.end
14. 2.49999e-05% 10/4.00001e+07 worker() - entry
15. 2.49999e-06% 1/4.00001e+07 main() - for.end
16. 2.49999e-06% 1/4.00001e+07 main() - entry
17. 2.49999e-06% 1/4.00001e+07 main() - for.end12
18. 2.49999e-06% 1/4.00001e+07 main() - return
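For illustration only, here is a C11 sketch of the kind of counter update the patch asks the instrumentation to emit. It is not the patch itself, which changes the IR that the profiling pass generates. The names counter, bump, count_atomically, NTHREADS and NITERS are made up for this example. C11's memory_order_relaxed corresponds to LLVM's "monotonic" ordering, so each increment is one indivisible add with no additional ordering guarantees.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NTHREADS 4
#define NITERS 100000

/* Hypothetical stand-in for a single profiling counter. */
static atomic_ulong counter;

static void *
bump(void *arg)
{
	int i;

	(void)arg;
	for (i = 0; i < NITERS; i++) {
		/*
		 * One indivisible read-modify-write. A plain "counter++" on
		 * a non-atomic variable is a separate load, add and store,
		 * so concurrent threads can overwrite each other's updates.
		 */
		atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
	}
	return (NULL);
}

/* Run NTHREADS threads over the shared counter and return its final value. */
unsigned long
count_atomically(void)
{
	pthread_t thr[NTHREADS];
	int i, rc;

	atomic_store(&counter, 0);
	for (i = 0; i < NTHREADS; i++) {
		rc = pthread_create(&thr[i], NULL, bump, NULL);
		assert(rc == 0);
		(void)rc;
	}
	for (i = 0; i < NTHREADS; i++) {
		rc = pthread_join(thr[i], NULL);
		assert(rc == 0);
		(void)rc;
	}
	return (atomic_load(&counter));
}
```

Calling count_atomically() from a main() should always return NTHREADS * NITERS. Replacing the atomic add with a plain increment of an ordinary unsigned long would be a data race and can lose updates, which matches the undercount in the unpatched run above.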