[LLVMdev] FYI: Changing RunSafely.sh to only track user time

Mon Apr 19 10:49:26 PDT 2010

Jakob Stoklund Olesen wrote:
> On Apr 18, 2010, at 11:08 PM, John Criswell wrote:
>
>   
>> Second, why are you only interested in user time?  The reason why we had 
>> RunSafely.sh measure user + system time is that it gives a more accurate 
>> depiction of how well an optimization works.  If a program spends most 
>> of its time in the OS, increasing speed in user-space doesn't gain us 
>> much.  If a transform decreases user time but increases system time, 
>> then measuring only user time may show a speedup when measuring 
>> user+system will show a loss.
>>     
>
> What kind of optimization might change the system time?
>   

Inlining can (increased code size may affect demand paging).  Libcall 
optimizations can (they may change the amount of work done in userspace 
vs. kernel space).  Automatic pool allocation can (it may change 
frequency of calls into the OS for memory allocation as well as paging 
and cache behavior).  Anything that changes cache behavior can (because 
it can evict OS data and code from the cache).
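
To make the concern concrete, here is a minimal Python sketch.  It is 
not part of RunSafely.sh; the function names and the sample numbers are 
made up for illustration.  It reads a child's user and system CPU time 
separately via getrusage() and shows how a user-only comparison can 
report a speedup where user+system reports a loss.

```python
import resource
import subprocess
import sys


def child_cpu_times(cmd):
    """Run cmd and return (user_seconds, system_seconds) it consumed.

    Unix only: relies on getrusage(RUSAGE_CHILDREN), which accumulates
    the CPU time of children that have already been waited for.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return (after.ru_utime - before.ru_utime,
            after.ru_stime - before.ru_stime)


def speedup(baseline, candidate, include_system=True):
    """Return baseline/candidate time; a value above 1.0 means faster.

    Both arguments are (user_seconds, system_seconds) pairs.
    """
    def total(times):
        user, system = times
        return user + system if include_system else user

    return total(baseline) / total(candidate)


if __name__ == "__main__":
    # Hypothetical numbers: the transform saves 1s of user time but
    # costs 2s of extra system time.
    baseline = (10.0, 1.0)
    transformed = (9.0, 3.0)
    print("user only  :", speedup(baseline, transformed, include_system=False))
    print("user+system:", speedup(baseline, transformed, include_system=True))

    # Measuring a real child process (here, a trivial Python run).
    print(child_cpu_times([sys.executable, "-c", "pass"]))
```

With the made-up numbers, the user-only ratio is above 1.0 and the 
user+system ratio is below 1.0, which is exactly the disagreement 
described above.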

Other transforms are not optimizations, and understanding how they add 
overhead to both user and kernel time is important.  SAFECode can 
increase calls to mmap()/mremap() when dangling pointer detection is 
enabled.  Dynamic slicing increases time in the OS for trace file 
creation and trace file consultation; measuring solely user time may 
give a very inaccurate depiction of its execution time.

There's also the fact that some of the experiments I run compare 
compilation techniques to binary translation techniques.  For example, 
I've compared Valgrind to SAFECode; it wouldn't surprise me if each one 
triggers very different behaviors in the kernel.  I use the test-suite 
infrastructure to run these.

> The problem with measuring system time is that it can depend on many
> variable factors that have nothing to do with the process being tested.
> For instance, the number of files on disk, the amount of free space on
> disk, the total number of processes on the system, the amount of free
> memory pages on the system, and the size of the buffer cache can all
> affect how much work a system call has to do.
>   

That's a good point, although I think that most of the factors you list 
are more or less constant on a given system.  Going further, I'm 
guessing that the real reason some people want the change is so that 
they can compare performance numbers across *different* environments 
(e.g., for comparisons against GCC).  Is this correct?

That seems like a reasonable thing to want, but it doesn't change the 
fact that I and others need to measure user+system time because we're 
doing transforms that can change system time (or, at the very least, we 
have to prove that our transforms don't change system time appreciably).

Getting back to the original issue at hand, if Daniel wants to track 
user time only in the nightly tester experiments, I think he can do that 
by changing the nightly tester Makefiles (again, I think the *.time 
files generated by RunSafely.sh include all the relevant data).  I think 
it's just a matter of grep'ing the correct value out of the .time file.  
If that doesn't work, he can enhance RunSafely.sh to record just user 
time; as long as there's a command line option I can use to measure 
user+system, I'm happy.
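
As a rough sketch of the "pull the right value out of the .time file" 
idea, here it is in Python rather than grep.  I'm assuming the file 
contains `time -p` style lines ("real", "user", "sys", each followed by 
seconds); I haven't checked that against what RunSafely.sh actually 
writes, so treat the layout, the function names, and the sample text as 
assumptions.

```python
def parse_time_report(text):
    """Collect 'real', 'user' and 'sys' values from time -p style text.

    The line layout is an assumption, not verified against the files
    RunSafely.sh produces. Unrecognized lines are ignored.
    """
    times = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("real", "user", "sys"):
            try:
                times[parts[0]] = float(parts[1])
            except ValueError:
                continue  # not a number; skip the line
    return times


def reported_time(text, include_system=True):
    """Return user time, or user+system when include_system is True."""
    times = parse_time_report(text)
    total = times["user"]
    if include_system:
        total += times["sys"]
    return total


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    sample = "real 1.50\nuser 1.20\nsys 0.25\n"
    print("user only  :", reported_time(sample, include_system=False))
    print("user+system:", reported_time(sample, include_system=True))
```

The include_system flag plays the role of the command line option 
mentioned above: the nightly tester could pass False while other users 
keep the user+system behavior.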

-- John T.

> /jakob
>
>