[PATCH] Make LLVM profiling thread-safe

Matthew Dempsky matthew at dempsky.org
Wed Jun 26 18:28:53 PDT 2013


On Mon, Jun 24, 2013 at 03:25:14PM -0700, Matthew Dempsky wrote:
> On Mon, Jun 24, 2013 at 12:17:17PM -0700, Matthew Dempsky wrote:
> > Instead of emitting separate read-modify-write instructions to
> > increment the profiling counters, a monotonically ordered atomic add
> > instruction should be emitted.
> 
> Oops, need to fix the tests too!

Ping?  Any interest in this fix?  (I'm not an LLVM committer, btw.)

To demonstrate this is an actual issue, I wrote this sample program:

    #include <pthread.h>
    
    void
    noop(int i)
    {
    }
    
    void *
    worker(void *arg)
    {
    	int i;
    
    	for (i = 0; i < 1000000; i++)
    		noop(i);
    
    	return (NULL);
    }
    
    int
    main(void)
    {
    	int i;
    	pthread_t thr[10];
    
    	for (i = 0; i < 10; i++)
    		if (pthread_create(&thr[i], NULL, worker, NULL))
    			return (1);
    	for (i = 0; i < 10; i++)
    		if (pthread_join(thr[i], NULL))
    			return (1);
    
    	return (0);
    }

noop() should be entered exactly 10 million times (10 threads, 1,000,000
calls each), but llvm-prof currently reports only about 5.3 million:

$ clang -emit-llvm -c stress.c
$ opt -insert-edge-profiling < stress.o > stress-edge.bc
$ clang -o stress stress-edge.bc -L/usr/local/lib -lprofile_rt -lpthread
/usr/local/lib/libprofile_rt.so: warning: strcpy() is almost always misused, please use strlcpy()
/usr/local/lib/libprofile_rt.so: warning: strcat() is almost always misused, please use strlcat()
$ rm -f llvmprof.out
$ ./stress
$ llvm-prof stress-edge.bc llvmprof.out
===-------------------------------------------------------------------------===
LLVM profiling output for execution:
  

===-------------------------------------------------------------------------===
Function execution frequencies:

 ##   Frequency
  1. 5.3e+06/5.31039e+06 noop
  2.    10/5.31039e+06 worker
  3.     1/5.31039e+06 main

===-------------------------------------------------------------------------===
Top 20 most frequently executed basic blocks:

 ##      %% 	Frequency
  1. 25.9849% 5421453/2.08638e+07	worker() - for.inc
  2. 25.4526% 5310376/2.08638e+07	noop() - entry
  3. 24.4395% 5099022/2.08638e+07	worker() - for.body
  4. 24.1225% 5032868/2.08638e+07	worker() - for.cond
  5. 5.27228e-05%    11/2.08638e+07	main() - for.cond
  6. 5.27228e-05%    11/2.08638e+07	main() - for.cond1
  7. 4.79298e-05%    10/2.08638e+07	main() - for.inc10
  8. 4.79298e-05%    10/2.08638e+07	main() - if.end9
  9. 4.79298e-05%    10/2.08638e+07	main() - for.body3
 10. 4.79298e-05%    10/2.08638e+07	main() - for.inc
 11. 4.79298e-05%    10/2.08638e+07	main() - if.end
 12. 4.79298e-05%    10/2.08638e+07	main() - for.body
 13. 4.79298e-05%    10/2.08638e+07	worker() - for.end
 14. 4.79298e-05%    10/2.08638e+07	worker() - entry
 15. 4.79298e-06%     1/2.08638e+07	main() - for.end
 16. 4.79298e-06%     1/2.08638e+07	main() - entry
 17. 4.79298e-06%     1/2.08638e+07	main() - for.end12
 18. 4.79298e-06%     1/2.08638e+07	main() - return


With my patch, this is the output I get instead:

$ clang -emit-llvm -c stress.c
$ opt -insert-edge-profiling < stress.o > stress-edge.bc
$ clang -o stress stress-edge.bc -L/usr/local/lib -lprofile_rt -lpthread
/usr/local/lib/libprofile_rt.so: warning: strcpy() is almost always misused, please use strlcpy()
/usr/local/lib/libprofile_rt.so: warning: strcat() is almost always misused, please use strlcat()
$ rm -f llvmprof.out
$ ./stress
$ llvm-prof stress-edge.bc llvmprof.out
===-------------------------------------------------------------------------===
LLVM profiling output for execution:
  

===-------------------------------------------------------------------------===
Function execution frequencies:

 ##   Frequency
  1. 1e+07/1e+07 noop
  2.    10/1e+07 worker
  3.     1/1e+07 main

===-------------------------------------------------------------------------===
Top 20 most frequently executed basic blocks:

 ##      %% 	Frequency
  1.    25% 10000010/4.00001e+07	worker() - for.cond
  2. 24.9999% 10000000/4.00001e+07	noop() - entry
  3. 24.9999% 10000000/4.00001e+07	worker() - for.body
  4. 24.9999% 10000000/4.00001e+07	worker() - for.inc
  5. 2.74999e-05%    11/4.00001e+07	main() - for.cond
  6. 2.74999e-05%    11/4.00001e+07	main() - for.cond1
  7. 2.49999e-05%    10/4.00001e+07	main() - for.inc10
  8. 2.49999e-05%    10/4.00001e+07	main() - if.end9
  9. 2.49999e-05%    10/4.00001e+07	main() - for.body3
 10. 2.49999e-05%    10/4.00001e+07	main() - for.inc
 11. 2.49999e-05%    10/4.00001e+07	main() - if.end
 12. 2.49999e-05%    10/4.00001e+07	main() - for.body
 13. 2.49999e-05%    10/4.00001e+07	worker() - for.end
 14. 2.49999e-05%    10/4.00001e+07	worker() - entry
 15. 2.49999e-06%     1/4.00001e+07	main() - for.end
 16. 2.49999e-06%     1/4.00001e+07	main() - entry
 17. 2.49999e-06%     1/4.00001e+07	main() - for.end12
 18. 2.49999e-06%     1/4.00001e+07	main() - return


